Yahoo SEM daily report – Currency and Dups handling

If you are pulling daily keyword performance data from major search engines to store in your in Data Warehouse, you need to perform many of transformation before having a daily, weekly or monthly summaries available for business (end) users.  In the case of Yahoo the required transformations that I needed to make are, one to handle the currency conversion and two to handle Yahoo sending more than one keyword when only one is expected. This has seem to happen when query or landing page url is modified for the same bid unit and for high impressions queries. Also, for disabled queries/keywords.

The following sql handles both with data read from staging table and loaded into final table after some transformation and cleaning. It does currency conversion for all countries except US and Mexico (units are in US dollars) which do not need conversions.

Sub-query below uses MySQL GROUP_CONCAT to find any duplicate keywords for the bid unit and dim_curreny has the daily currency exchange rates pulled from external third party services early in the morning. se_query_group table has the grouping of the keywords based on company requirements and search engine. Query also filters out any keyword that has zero impressions.

The query can further be modified for better performance of using query_id by joining with data warehouse query tables and using query_id in the joins also using search engine bid unit as the surrogate key.

(se_date, search_engine, account_id, campaign_id, ad_group_id, keyword_id
,domain, url, query_group_id
,query, query_num_2
,impressions, clicks, ctr
,cpc_orig, cpc
,revenue_orig, revenue
,cost_orig, cost
,avg_position )
SELECT KP.se_date, KP.search_engine, KP.account_id, KP.campaign_id
,KP.ad_group_id, KP.keyword_id,
,KP.domain, KP.url, KP.query_group_id
WHEN query2 NOT RLIKE '---' THEN query2 # Normal single kw
WHEN query2 RLIKE '---' THEN substring_index(query2, '---', 1) # Get 1st kw
END query2
WHEN query2 RLIKE '---' THEN substring_index(substring_index(query2, '---', 2), '---', -1)
END query_num_2 # Get 2nd query
,cpc cpc_orig
WHEN C.country_code not in ('US', 'MX') THEN cpc / C.currency_rate
ELSE cpc
END cpc
,revenue revenue_orig
WHEN C.country_code not in ('US', 'MX') THEN revenue / C.currency_rate
ELSE revenue
END revenue
,cost cost_orig
WHEN C.country_code not in ('US', 'MX') THEN cost / C.currency_rate
ELSE cost
END cost
WHEN avg_position > 0 and avg_position < 1 THEN 1
ELSE avg_position
END avg_position
SELECT se_date, search_engine
,account_id, campaign_id, ad_group_id, keyword_id,
,domain, url, query_group_id, query
,GROUP_CONCAT(query SEPARATOR '---') query2
,SUM(impressions) impressions, SUM(clicks) clicks
,AVG(ctr) ctr, AVG(cpc) cpc, 
,AVG(revenue) revenue, 
,SUM(cost) cost, AVG(avg_position) avg_position
FROM stg_keyword_perf
WHERE se_date >= '' #-- Date here
GROUP BY se_date ,account_id ,campaign_id ,ad_group_id ,keyword_id
) KP #- stg_kw_perf
JOIN se_query_group QG
ON KP.query_group_id = QG.query_group_id AND KP.search_engine = QG.search_engine
JOIN dim_currency C
ON KP.se_date = C.currency_date AND QG.country_code = C.country_code
WHERE KP.se_date = '' #-- Date here
AND KP.impressions 0


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s