Yahoo SEM daily report – Currency and Dups handling

If you are pulling daily keyword performance data from major search engines to store in your in Data Warehouse, you need to perform many of transformation before having a daily, weekly or monthly summaries available for business (end) users.  In the case of Yahoo the required transformations that I needed to make are, one to handle the currency conversion and two to handle Yahoo sending more than one keyword when only one is expected. This has seem to happen when query or landing page url is modified for the same bid unit and for high impressions queries. Also, for disabled queries/keywords.

The following sql handles both with data read from staging table and loaded into final table after some transformation and cleaning. It does currency conversion for all countries except US and Mexico (units are in US dollars) which do not need conversions.

Sub-query below uses MySQL GROUP_CONCAT to find any duplicate keywords for the bid unit and dim_curreny has the daily currency exchange rates pulled from external third party services early in the morning. se_query_group table has the grouping of the keywords based on company requirements and search engine. Query also filters out any keyword that has zero impressions.

The query can further be modified for better performance of using query_id by joining with data warehouse query tables and using query_id in the joins also using search engine bid unit as the surrogate key.

INSERT INTO INT_MKTG_KEYWORD_PERF
(se_date, search_engine, account_id, campaign_id, ad_group_id, keyword_id
,domain, url, query_group_id
,query, query_num_2
,impressions, clicks, ctr
,cpc_orig, cpc
,revenue_orig, revenue
,cost_orig, cost
,num_assist
,avg_position )
SELECT KP.se_date, KP.search_engine, KP.account_id, KP.campaign_id
,KP.ad_group_id, KP.keyword_id,
,KP.domain, KP.url, KP.query_group_id
,CASE
WHEN query2 NOT RLIKE '---' THEN query2 # Normal single kw
WHEN query2 RLIKE '---' THEN substring_index(query2, '---', 1) # Get 1st kw
END query2
,CASE
WHEN query2 RLIKE '---' THEN substring_index(substring_index(query2, '---', 2), '---', -1)
END query_num_2 # Get 2nd query
,impressions
,clicks
,ctr
,cpc cpc_orig
,CASE
WHEN C.country_code not in ('US', 'MX') THEN cpc / C.currency_rate
ELSE cpc
END cpc
,revenue revenue_orig
,CASE
WHEN C.country_code not in ('US', 'MX') THEN revenue / C.currency_rate
ELSE revenue
END revenue
,cost cost_orig
,CASE
WHEN C.country_code not in ('US', 'MX') THEN cost / C.currency_rate
ELSE cost
END cost
,CASE
WHEN avg_position > 0 and avg_position < 1 THEN 1
ELSE avg_position
END avg_position
FROM (
SELECT se_date, search_engine
,account_id, campaign_id, ad_group_id, keyword_id,
,domain, url, query_group_id, query
,GROUP_CONCAT(query SEPARATOR '---') query2
,SUM(impressions) impressions, SUM(clicks) clicks
,AVG(ctr) ctr, AVG(cpc) cpc, 
,AVG(revenue) revenue, 
,SUM(cost) cost, AVG(avg_position) avg_position
FROM stg_keyword_perf
WHERE se_date >= '' #-- Date here
GROUP BY se_date ,account_id ,campaign_id ,ad_group_id ,keyword_id
) KP #- stg_kw_perf
JOIN se_query_group QG
ON KP.query_group_id = QG.query_group_id AND KP.search_engine = QG.search_engine
JOIN dim_currency C
ON KP.se_date = C.currency_date AND QG.country_code = C.country_code
WHERE KP.se_date = '' #-- Date here
AND KP.impressions 0
;

HTH,
Shiva

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s