Modifying documents with the Talkwalker API
Contents
1. user_name: "User 2", user_email: "user_2@site.com", access_level: "FULL_TOOL"
project_id: "project_id_3", project_name: "Project 3", account_id: "account_id_1", account_name: "account_name_1"
user_id: "user_id_1", user_name: "Admin 1", user_email: "user_1@site.com", access_level: "ACCOUNT_ADMIN"

View List

curl 'https://api.talkwalker.com/api/v2/talkwalker/p/<project_id>/views?access_token=<access_token>'

This endpoint returns a list of all the views in a project.

Note: This endpoint is part of the Talkwalker Project API and needs a read/write access token.

Result:
status_code: 0
status_message: "OK"
request: "GET /api/v2/talkwalker/p/<project_id>/views?access_token=<access_token>&pretty=true"
result_views: projects: id: <project_id>, "Project 1", dashboards: "Dashboard 1", "Dashboard 2", "Dashboard 3", "Dashboard 4", "Dashboard 5", "Dashboard 6" (each with an id and a title)

Talkwalker Channelmonitoring API

Channelmonitoring suggest

This provides the same functionality as the page monitoring suggest in Talkwalker. Given a string (URL or name) and a type (default: auto), it will provide several candidates. Co
2. stream_id: "teststream", rules: rule_id: "rule 1", query: "cats"

Streaming Example

curl 'https://api.talkwalker.com/api/v2/stream/s/teststream/results?access_token=demo'

The response is a stream of chunks. Chunks contain either metadata on the Talkwalker stream (CT_CONTROL) or search results (CT_RESULT).

Response:
chunk_type: "CT_CONTROL", chunk_control: timeframe: start: 1430201017166, end: 1430201040000; stream: id: "teststream", status: "active"
chunk_type: "CT_RESULT", chunk_result: data: data:
  url: "http://annukcreations.blogspot.com/2014/12/sunny-rings.html"
  indexed: 1417999367498
  search_indexed: 1417999504832
  published: 1417999319393
  title: "Color and Light Inspirations in Jewelry SUNNY RINGS"
  content: "Welcome to my colorful little island. This blog is about sharing my colorful world, my sources of inspiration and all what fuels my imagination. Islands and kitties, beauty and art, nature and love, and creative souls who inspire me. Thank you for following me on my journey.\n\nI am an artist and jewelry maker from Turin, Italy, and I am half Italian and half German. I have a background in Language studies and a University degree in German and English, but I have always been fascinated by handmade objects, art, creativity and color. This resulted in my passion for handmade jewelry. Like many jewelry makers an
3. Table columns: http code, status code, message, description.
http code: 404, 400, 400, 400, 403, 403, 403
status code: 0, 20, 21

Messages with their descriptions:
OK: Default answer.
Internal Server Error: An unexpected exception was encountered.
Search Execution Exception: An unexpected exception was encountered, related to the search.
Parameter Missing: Required parameters are missing. The missing parameters are provided in key params.
Error in query: Could not parse query. The details can be found under details.
Invalid parameter value: A parameter has an unacceptable value. The parameter is listed under param and the details under details.
Invalid, missing or inactive access token: The access token is either missing or the provided value is invalid.
Call limit exceeded for this endpoint: The called endpoint has a limited call frequency; the values should be cached by the client.
No credits left: The account ran out of credits.
API application is inactive: The API account is inactive. appId gives the id of that account.
No such application linked: The provided id is not linked in the API to any project

Further messages:
Linked application inactive or deleted
Access denied. Insufficient access rights
Wrong stream id. No such stream defined
Invalid operation on document
Could not parse json
Invalid operation on stream
Number of rules to set exceeds maximum number of rules
Cannot create any more streams
A stream with this name already exists
4. Country options (country name followed by its code). Codes for entries preceding this excerpt: bm, bt, bo, bq.

BOSNIA AND HERZEGOVINA ba; BOTSWANA bw; BOUVET ISLAND bv; BRAZIL br; BRITISH INDIAN OCEAN TERRITORY; BRITISH VIRGIN ISLANDS vg; BRUNEI bn; BULGARIA bg; BURKINA FASO bf; BURUNDI bi; CAMBODIA kh; CAMEROON cm; CANADA ca; CAPE VERDE cv; CAYMAN ISLANDS ky; CENTRAL AFRICAN REPUBLIC cf; CHAD td; CHILE cl; CHINA cn; CHRISTMAS ISLAND cx; COCOS ISLANDS cc; COLOMBIA co; COMOROS km; CONGO cg; COOK ISLANDS ck; COSTA RICA cr

ITALY it; JAMAICA jm; JAPAN jp; JERSEY je; JORDAN jo; KAZAKHSTAN kz; KENYA ke; KIRIBATI ki; KUWAIT kw; KYRGYZSTAN kg; LAOS la; LATVIA lv; LEBANON lb; LESOTHO ls; LIBERIA lr; LIBYA ly; LIECHTENSTEIN li; LITHUANIA lt; LUXEMBOURG lu; MACAO mo; MACEDONIA mk; MADAGASCAR mg; MALAWI mw; MALAYSIA my; MALDIVES mv; MALI ml; MALTA mt; MARSHALL ISLANDS mh; MARTINIQUE mq; MAURITANIA mr

SAO TOME AND PRINCIPE st; SAUDI ARABIA sa; SENEGAL sn; SERBIA rs; SERBIA AND MONTENEGRO; SEYCHELLES; SIERRA LEONE; SINGAPORE; SINT MAARTEN; SLOVAKIA; SLOVENIA; SOLOMON ISLANDS; SOMALIA; SOUTH AFRICA; SOUTH GEORGIA AND THE SOUTH SANDWICH ISLANDS; SOUTH KOREA; SOUTH SUDAN; SPAIN; SRI LANKA; SUDAN; SURINAME; SVALBARD AND JAN MAYEN; SWAZILAND; SWEDEN; SWITZERLAND; SYRIA; TAIWAN; TAJIKISTAN; TANZANIA; THAILAND
5. int httpCode = httpConnection.getResponseCode();
        // getting the correct input stream
        if (httpCode == 200) {
            InputStream is = httpConnection.getInputStream();
            try {
                readStream(httpConnection, is, resume_ts);
            } catch (IOException ioe) {
                // stream or connection was interrupted:
                // retry with next iteration (token & stream_resume)
            }
        } else if (httpCode == 503) {
            // the service is currently unavailable
            int secondsToWait = httpConnection.getHeaderFieldInt("Retry-After", 60);
            System.out.println("TEMPORARILY UNAVAILABLE");
            System.out.println("WAITING " + secondsToWait + "s UNTIL RETRYING");
            Thread.sleep(secondsToWait * 1000);
        } else {
            // when encountering an error we exit the loop
            try {
                InputStream is = httpConnection.getErrorStream();
                readError(httpConnection, is, httpCode);
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                finished = true;
            }
        }
    } catch (IOException ex) {
        // try again
        ex.printStackTrace();
        // sleep a minute
        Thread.sleep(60 * 1000);
    }
}
deleteStream(...);

private void readError(HttpURLConnection httpConnection, InputStream errorInputStream, int httpCode) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    byte[] dataBuf = new byte[1024 * 1024];
    // read answer
    while (true) {
        int read = errorInputStream.read(dataBuf, 0, dataBuf.length);
        if (read == -1) {
            break;
        }
        bos.write(dataBuf, 0, read);
    }
    InputStream is = new ByteArrayInputStream(bos.toByteArray());
    if (httpConnecti
6. pagination: next: "GET /api/v1/search/results?access_token=demo&q=cats&pretty=true&offset=10", total: 298138
result_content: data: data:
  url: "http://annukcreations.blogspot.com/2014/12/sunny-rings.html"
  indexed: 1417999367498
  search_indexed: 1417999504832
  published: 1417999319393
  title: "Color and Light Inspirations in Jewelry SUNNY RINGS"
  content: "Welcome to my colorful little island. This blog is about sharing my colorful world, my sources of inspiration and all what fuels my imagination. Islands and kitties, beauty and art, nature and love, and creative souls who inspire me. Thank you for following me on my journey.\n\nI am an artist and jewelry maker from Turin, Italy, and I am half Italian and half German. I have a background in Language studies and a University degree in German and English, but I have always been fascinated by handmade objects, art, creativity and color. This resulted in my passion for handmade jewelry. Like many jewelry makers and artists, my first jewels were made with beads, but soon I discovered the potentials of so many materials and I developed my very personal style. I would describe myself as a mixed media and eclectic artist. My favorite materials include glass, polymer clay, metal sheets and wood, but as I love experimenting, the possibilities are endless. What I love most about the creative process is the modeling and combining of materials. I e
7. Table of contents (excerpt):
query
Talkwalker Query Syntax
  Special Transformations
  Boolean Operators
  Advanced Search Options
  Url based Search
  Metric Minimum/Maximum Restrictions
  Geographic Restrictions
  Special Query
Evolution and stability of document fields
Protocols, Encodings and Value Field Options
  Protocols and Encodings
  Evolution of JSON fields
  Value Options
API Account
  Access Tokens
8. domain_url:blog.talkwalker.com would return all results of the domain talkwalker.com, also those from www.talkwalker.com, while host_url:blog.talkwalker.com would return only results from blog.talkwalker.com, not from www.talkwalker.com.

Sentiment

Talkwalker uses natural language processing (NLP) to compute a general sentiment for the documents in our index. The accuracy of automatic detection is limited by irony, sarcasm and misspellings in the documents. Sentiment analysis is available for:

Language     Code            Language     Code
Albanian     sq              Hungarian    hu
Arabic       ar              Italian      it
Chinese      zh-cn, zh-tw    Korean       ko
Croatian     hr              Malay        ms
Czech        cs              Norwegian    no
Danish       da              Polish       pl
Dutch        nl              Portuguese   pt
English      en              Russian      ru
Finnish      fi              Slovak       sk
Flemish      nl              Spanish      es
French       fr              Swedish      sv
German       de              Turkish      tr

Reach

The reach of an article or post represents the number of people who were reached by this article or post. Note that the views only get set to a proper value if the host of the URL is either a domain (like theguardian.com) or a domain with a well-known third-level subdomain in front (this mainly applies to www, e.g. www.theguardian.com). Reach is set to 0 for other hosts, i.e. hosts with other third-level subdomains such as foobar.blogspot.com, because using the Alexa views of the domain would otherwise assign much too high a reach to mere sub-hosts.

Reach is calculated in the following ways:
Blogs, News Sites, Forums: Number of Page Views
Fa
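The difference between the domain filter and the host filter described above can be illustrated with a small Python sketch. It is only an illustration of the matching idea, not Talkwalker's implementation: the helper names are hypothetical, and the "last two labels" notion of a domain is a simplifying assumption that is wrong for suffixes such as co.uk.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lowercase host part of a URL, or '' if there is none."""
    return (urlparse(url).hostname or "").lower()


def naive_domain(host):
    """Naive registered domain: the last two labels (assumption, see lead-in)."""
    return ".".join(host.split(".")[-2:])


def matches_host_filter(url, host_filter):
    """Host filter: the host must be exactly the given host."""
    return host_of(url) == host_filter.lower()


def matches_domain_filter(url, domain_filter):
    """Domain filter: any host of the same domain matches."""
    return naive_domain(host_of(url)) == naive_domain(domain_filter.lower())
```

With the filter value blog.talkwalker.com, a URL on www.talkwalker.com passes the domain check but not the host check, which mirrors the behaviour described in the text.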
9. %20published:<1409529600000&stream_resume=1406851200000

How to get the documents of the last hour of a Talkwalker project

To get the results from the last hour, set stream_resume to the epoch time one hour (i.e. 3600000 milliseconds) ago and stream_stop to the most recent time. You will get all the documents that have been found during the last hour.

Note: these are the documents that were found during this period (timestamp in search_indexed). The documents were not necessarily published during the last hour, so the set of documents is not equal to the set shown for the last hour in Talkwalker. When documents that were published earlier are found and streamed, they are added to Talkwalker for the period they were published in.

curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&stream_resume=1420531486000&stream_stop=1420535086000'

How to stream all documents from Talkwalker Page Monitoring

The following command creates a stream "test", used to stream the documents to your application:

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/create?access_token=<access_token>' -d '{"streamid": "test"}' -H "Content-Type: application/json; charset=UTF-8"

You can then use the test stream to stream all documents from page monitoring by setting topic to page:

curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&am
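The last-hour window described above can be computed as in the following Python sketch. The function names are hypothetical, and the URL layout is assumed from the curl example in this excerpt; it is a sketch of the arithmetic, not an official client.

```python
import time
from urllib.parse import urlencode

HOUR_MS = 3_600_000  # one hour in milliseconds


def last_hour_window(now_ms):
    """Return (stream_resume, stream_stop) covering the hour before now_ms."""
    return now_ms - HOUR_MS, now_ms


def stream_results_url(stream_id, project_id, access_token, now_ms=None):
    """Build the results URL for the last hour (path assumed from the curl example)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # current time as epoch milliseconds
    resume, stop = last_hour_window(now_ms)
    params = urlencode({
        "access_token": access_token,
        "stream_resume": resume,
        "stream_stop": stop,
    })
    return ("https://api.talkwalker.com/api/v2/stream/s/"
            f"{stream_id}/p/{project_id}/results?{params}")
```

Passing now_ms explicitly keeps the function deterministic; the values in the curl example (1420531486000 and 1420535086000) are exactly one hour apart.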
10. Credits

1 credit per returned result, with at least 10 credits per call (e.g. 100 results = 100 credits, 10 results = 10 credits and 0 results = 10 credits).

Examples

Get 100 results containing the words "cats" and "dogs", but not "birds"

Set the query "cats AND dogs AND NOT birds" with q=cats%20AND%20dogs%20AND%20NOT%20birds (note: in URLs, spaces are replaced by %20) and set hits per page to 100 with hpp=100.

curl 'https://api.talkwalker.com/api/v1/search/results?access_token=demo&q=cats%20AND%20dogs%20AND%20NOT%20birds&hpp=100&pretty=true'

More on the Talkwalker Query Syntax

Get results containing the word "cats", sorted from new to old

To sort the results by date, set sort_by to published (to sort by the date of publication). To get the newest results first, set sort_order=desc.

curl 'https://api.talkwalker.com/api/v1/search/results?access_token=demo&q=cats&sort_by=published&sort_order=desc&pretty=true'

All options for sort_by are: reach, facebook_shares, facebook_likes, twitter_shares, twitter_retweets, twitter_followers, youtube_likes, youtube_dislikes, youtube_views, cluster_size, comment_count, published, search_indexed.

More on the document fields

Get results containing the word "cats" published in American blogs

curl 'https://api.talkwalker.com/api/v1/search/results?access_token=demo&q=cats%20AND%20sourcetype:BLOG%20AND%20sourcecountry:us&pretty=true'

Talkwalker Search H
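The credit rule and the query encoding above can be sketched in Python as follows. The helper names are hypothetical; the endpoint and the q and hpp parameters are taken from the curl examples in this excerpt, and the credit formula simply restates the rule "1 credit per result, at least 10 per call".

```python
from urllib.parse import quote


def encode_query(query):
    """Percent-encode a query so that spaces become %20."""
    return quote(query, safe="")


def search_url(query, hits_per_page=100, access_token="demo"):
    """Build a search URL like the curl examples (a sketch, not an official client)."""
    return ("https://api.talkwalker.com/api/v1/search/results"
            f"?access_token={access_token}&q={encode_query(query)}"
            f"&hpp={hits_per_page}")


def credits_for(results_returned):
    """1 credit per returned result, with a minimum of 10 credits per call."""
    return max(10, results_returned)
```

Note that quote with safe="" also encodes characters such as the colon, which is harmless in a query-string value.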
11. Codes for entries preceding this excerpt: cd, ti, tg, tk, to, tt, tn, tr, tm, tc, tv, ug, ua, ae, uk, us.

STATES MINOR OUTLYING ISLANDS um; URUGUAY uy; US VIRGIN ISLANDS vi; UZBEKISTAN uz; VANUATU vu; VATICAN va; VENEZUELA ve; VIETNAM vn; WALLIS AND FUTUNA wf; WESTERN SAHARA eh; YEMEN ye; ZAMBIA zm; ZIMBABWE zw

API Account

Access Token

Demo

To try the Talkwalker API, you can use the access token "demo" (access_token=demo). With this token you can try the Search API (results and histogram) and the Streaming API. Accessing the Talkwalker API with this token will not return any social media results (only results from blogs, forums and news are returned). This token can be used for testing only.

Your own Access Token

To use the Talkwalker API with the topics from your Talkwalker project, or to get results from social media (Twitter, Facebook), you need to apply for your own access tokens:
- read/write access tokens are necessary for search, channel monitoring, updating and deleting documents in a project, and for creating streams, deleting streams, setting panels and setting rules;
- authentication access tokens are necessary when using the Authentication API.

To get an access token, please contact us.

OAuth 2.0

For an integration of private Talkwalker widgets and data in external applications, Talkwalker and the Talkwalker API can authenticate users via OAuth 2.0. Every external application th
12. Table columns: Field Name, Name, Write access through API, Comment.

Author attributes (write access: n):
Continental location of the author
Country location of the author
Regional location of the author
City location of the author
Longitudinal location of the author
Latitudinal location of the author
Resolution of the geo data extraction
Name of the author
Birthdate of the author
Gender of the author
Url to the profile of the author
For documents which don't include location data, these fields are approximated.

source_attributes
Field names: world_data.continent, world_data.country, world_data.region, world_data.city, world_data.longitude, world_data.latitude, country_code, resolution, id, type, name, birthdate
Names: Source Continent, Source Country, Source Region, Source City, Source Longitude, Source Latitude, Source ID, Source Type, Source Name, Source Birthdate
Write access through API: n
Comments: Continental location of the source; Country location of the source; Regional location of the source; City location of the source; Longitudinal location of the source; Latitudinal location of the source; Resolution of the geo data extraction

source_attributes (continued)
Field Name   Name                Write access   Comment
gender       Source Gender       n
image_url    Source Image URL    y
short_name   Source Short Name   n
url          Source URL          y              URL of the source
For documents which don't include location data, these fields are approximated.

Evolution and stability of document fields

The structure
13. equal to the current time returns the same results as Talkwalker. When adding the streamed results to a local database, you can group them later by the value in the published field.

Time zones

Time ranges in the Talkwalker application are relative to the time zone set up under General Settings > Project display options > Time zone, while the Talkwalker API uses Unix time (epoch time) in milliseconds, without time zones. This can make results that are equal appear to be different in the API.

No maximum of documents in the current month

While the Talkwalker application applies a maximum of found documents per month, the Talkwalker API returns all documents that can be found for the current or given month. When the API is used with a Talkwalker project, the full project history is available.
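To see why the same wall-clock time can map to different API timestamps, the following Python sketch converts time-zone-aware datetimes to epoch milliseconds. The function name and the UTC+2 project time zone are assumptions chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert an aware datetime to epoch milliseconds (integer arithmetic)."""
    if moment.tzinfo is None:
        raise ValueError("naive datetime: attach a time zone first")
    return (moment - EPOCH) // timedelta(milliseconds=1)


# Hypothetical project time zone (UTC+2), used only for this example.
project_tz = timezone(timedelta(hours=2))
local_midnight = datetime(2014, 8, 1, tzinfo=project_tz)
utc_midnight = datetime(2014, 8, 1, tzinfo=timezone.utc)

# The two "midnights" differ by the two-hour offset.
offset_ms = to_epoch_ms(utc_midnight) - to_epoch_ms(local_midnight)
```

A time range selected in the application for a project in such a time zone therefore starts two hours earlier, in epoch terms, than the same calendar date interpreted as UTC.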
14. error_data);
            }
            curl_close($ch);
            if (!$this->finished) {
                if ($this->wait_for_retry > 0) {
                    echo "SERVICE UNAVAILABLE\n";
                    echo "WAITING " . $this->wait_for_retry . "s UNTIL RETRYING\n";
                    sleep($this->wait_for_retry);
                    $this->wait_for_retry = 0;
                } else {
                    sleep(5 * 60);
                }
            }
        }
    }

    function read_stream($ch, $data) {
        $http_status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
        $this->unprocessed_data = $this->unprocessed_data . $data;
        // read the header when it is complete
        if ($this->header_size != $header_size) {
            $this->header_size = $header_size;
            $header_complete = FALSE;
        } else {
            $header_complete = TRUE;
        }
        $partial_header = substr($this->unprocessed_data, 0, $header_size);
        if ($header_complete && !$this->header) {
            $this->header = substr($this->unprocessed_data, 0, $header_size);
            $this->unprocessed_data = substr($this->unprocessed_data, $header_size);
        }
        if ($header_complete && $http_status == 200) {
            // split on \r\n
            $arr_data = explode("\r\n", $this->unprocessed_data);
            $count = count($arr_data);
            for ($i = 0; $i < $count; $i++) {
                $line = $arr_data[$i];
                // try parse json
                if (strlen($line) > 0) {
                    $json = json_decode($line);
                    if ($json == NULL) {
                        // put it back only if last element
                        if ($i == $count - 1) {
                            $this->unprocessed_data = $line;
                        } else {
                            $this->finished = TRUE;
                            $this->handleParseError($line);
                        }
                    } else {
                        if (is
15. extra_author_attributes: world_data: (empty), id: "ex:annukcreations.blogspot.com-698904645", name: "view my complete profile", gender: "MALE"
extra_source_attributes: world_data: continent: "North America", country: "United States", region: "District of Columbia", city: "Washington, D.C.", longitude: -77.0094185808, latitude: 38.8995493765, country_code: "us"; id: "ex:annukcreations.blogspot.com", name: "http://annukcreations.blogspot.com/"
engagement: 3
reach: 0
data: url: "http://slshoeicidal.wordpress.com/2014/12/06/high-rez-snobbery-715-winter-trend-ice ... (truncated)

More on the Talkwalker Search API

Talkwalker Streaming API

Overview & Example

How it works

The Talkwalker Streaming API delivers real-time data through a persistent connection to our servers. Configure your stream with a set of filtering rules and connect to the stream, and new results will be delivered in real time as soon as they are found by our crawlers. You will not need to do any polling to receive new data.

You set up and configure the Streaming API by defining rules (Boolean query language, media types, etc.). The Streaming API then finds and collects all relevant data and adds it to your data stream, with individually highlighted snippets per matched rule. This feature allows you to gather data from many rules through a single stream while easily matching the results
16. foreach (explode("\r\n", $header) as $i => $line) {
            if ($i === 0) {
                $headers['http_code'] = $line;
            } else if ($line) {
                list($key, $value) = explode(': ', $line);
                $headers[$key] = $value;
            }
        }
        return $headers;
    }

    function createStream($name) {
        $ch = curl_init();
        $stream = new stdClass();
        $stream->streamid = $name;
        $url = $this->url . 'v1/stream/create';
        $url .= '?access_token=' . $this->token;
        $this->setCurlOptions($ch);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_POST, 1);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        $headers = array('Cache-Control: no-cache', 'Pragma: no-cache', 'Content-Language: en-US');
        curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($stream));
        $result = curl_exec($ch);
        curl_close($ch);
        $answer = json_decode($result);
        if ($answer != null && $answer->status_code != 0) {
            echo $result;
            return;
        }
        echo "CREATED STREAM " . $name . "\n";
        return $name;
    }

    function deleteStream($name) {
        $ch = curl_init();
        $url = $this->url . 'v1/stream/s/' . $name;
        $url .= '/delete';
        $url .= '?access_token=' . $this->token;
        $this->setCurlOptions($ch);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'DELETE');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
        $headers = array('Cache-Control: no-cache', 'Pragma: no-cache', 'Content-Language: en-US');
        curl_setopt($ch, CU
17. format of the sequence has been changed: chunks are delivered in a flat list, separated by newline characters. Each chunk contains a document or stream information. Result documents have chunk_type CT_RESULT, CT_CONTROL identifies control chunks containing information about the next result chunks, and CT_ERROR identifies error message chunks.

Result Chunk
chunk_type: "CT_RESULT"
chunk_result: data:
  data: <default result data, see simple search>
  highlighted_data:
    title_snippet: <title snippet for rule>
    content_snippet: <content snippet for rule>
    matched: rule_id: <rule id>, stream_id: "stream2", panel_id: "panel1", "panel2", rule_query: "cats AND dogs" (if rule_id is not set on rule)

Control Chunk
chunk_type: "CT_CONTROL"
chunk_control: timeframe: start: <start time>, end: <stop time>

Error Chunk
chunk_type: "CT_ERROR"
chunk_error: status_code: <code>, status_message: <error message>, data: key: "errdetail", value: "some details"

Credits

Each result, independent of how many rules match, will be counted as 1 credit. If no credits are left, the stream is stopped and a control chunk containing the timestamp of the end of the stream (needed for resuming) is sent. API calls which don't return any results are not counted. The documents are billed after every co
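A client reading such a newline-separated chunk list could dispatch on the chunk type roughly as in the Python sketch below. The key names chunk_type, chunk_result, chunk_control and chunk_error are assumed from the chunk descriptions above (the underscores were lost in this copy), and parse_chunks is a hypothetical helper, not part of any Talkwalker client.

```python
import json

# Assumed mapping from chunk type to the key holding its payload.
PAYLOAD_KEYS = {
    "CT_RESULT": "chunk_result",
    "CT_CONTROL": "chunk_control",
    "CT_ERROR": "chunk_error",
}


def parse_chunks(lines):
    """Yield (chunk_type, payload) for every non-empty line of a chunk stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank separator lines
        chunk = json.loads(line)
        chunk_type = chunk.get("chunk_type")
        key = PAYLOAD_KEYS.get(chunk_type)
        # Unknown chunk types are passed through whole instead of being dropped.
        yield chunk_type, chunk.get(key) if key else chunk
```

Passing unknown chunk types through unchanged follows the document's advice that new fields may be added over time, so a client should not fail on content it does not recognise.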
18. Messages (end of the table): not find panel with id; Panel is still referenced; HTTP Version Not Supported; Url is malformed; Could not execute action in Talkwalker; Access prohibited

Descriptions:
- Exceeded the maximum allowed sources (whitelist or blacklist) for this API account. number_max is the limit, number_available how many we can save and number_saving the number we tried to save.
- Exception when trying to stream with a stream that has no rule defined.
- A new stream (same streamid) is connected, so the old stream will be disconnected.
- The stream was disconnected due to the given reason.
- The called endpoint was not found.
- Authentication API endpoints need to be called using HTTPS.
- This user id does not exist or is not linked to this project.
- This project can not be accessed with the given access token.
- Too many streams running in parallel for this account.
- A rule with the given id could not be found.
- A panel with the given id could not be found.
- This panel could not be deleted; it is still used in a stream.
- The Talkwalker Streaming API supports HTTP 1.1 or newer.
- The given URL for channel monitoring is malformed.
- Error in connecting to a Talkwalker project.
- Access prohibited due to access restriction settings.

Error Handling Streaming API

Resuming a disconnected stream

A stream can be disconnected for several reasons: given maximum of hits (max_hits) reached, stream_stop reached, no credits left, server issues or connection problem
19. of main cluster entry. Will group identical or similar stories from multiple sources together.

Field Name      Name              Write access   Comment
                Meta Cluster Id   n              Url of main cluster entry. Will group identical or similar clusters together.
tags_internal   Internal Tags 1   n              Only in Talkwalker project. Tags used internally, e.g. automatically set tags.
tags_marking    Internal Tags 2   y              Only in Talkwalker project. Tags (important, read, checked, replied) used internally, e.g. automatically set tags.
tags_customer   Customer Tags     y              Only in Talkwalker project. Tags added by users of Talkwalker.
tags_plugin     Plugin tags       y              Only in Talkwalker project. Tags added by plugins in Talkwalker.

See the chapter on Protocols, Encodings and Value Field Options for possible values for the fields sourcetype, lang or country_code.
1 Can not be changed after creating a new document.
2 Must not be null or empty.
3 Extracted automatically when left empty.

Content

Talkwalker provides result snippets for all content. In all cases, the content field only contains the first words of the document; in addition, we provide the part of the document which matches the query in the content_snippet field. In the Streaming API, a snippet is provided for every matching rule.

URLs

To filter on specific websites in a query, the fields domain_url and host_url can be used. host_url is used for specific hosts like www.talkwalker.com or blog.talkwalker.com, while domain_url would filter on all hosts in a specific domain, i.e.
20. of the documents will not be changed. Existing fields will not be removed and their formatting will not be changed. Occasionally, new fields will be added to the documents and the order of fields can change; please take this into account when implementing a custom client.

Streaming (repeated extra entries for each matching rule, available in streaming only)

Extra Fields

On streaming, this information is present in extra_entrydata.

Field Name         Name               Write access through API   Comment
highlighted_data   Highlighted Data   n                          Content and title snippet of matched rules, queries and panels
matched            Matched            n                          Stream, Rule and Panel which were matched
rule_id            matched rule       n                          ID of matched rule
rule_query         matched rule       n                          Query of matched rule when id is not set
stream_id          matched stream     n                          ID of matched stream
panel_id           matched panel      n                          ID of matched Panel
matched_profile    Matched Profile    n                          Profile which matched (of Talkwalker)
title_snippet      Title Snippet      n                          If a match occurred in the title, this field will contain the snippet related to the query set in the datafeed
content_snippet    Content Snippet    n                          If a match occurred in the article, this field will contain the snippet related to the query set in the datafeed

Protocols, Encodings and Value Field Options

Protocols and Encodings

The Talkwalker API uses HTTP protocol 1.1. The Streaming API streams documents using the HTTP 1.1 chunked transfer encoding mechanism. The data is compressed using gzip (Accept-Encod
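Because the stream is gzip-compressed and arrives in pieces, a client has to decompress incrementally and only split lines once enough bytes are available. The following Python sketch shows one way to do that with the standard zlib module; iter_gzip_lines is a hypothetical helper, and most HTTP libraries already undo the chunked transfer encoding (and often the gzip layer) for you.

```python
import gzip
import zlib


def iter_gzip_lines(byte_chunks):
    """Incrementally decompress gzip data and yield complete text lines."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    buffer = b""
    for chunk in byte_chunks:
        buffer += decompressor.decompress(chunk)
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            text = line.strip()  # also removes a trailing \r
            if text:
                yield text.decode("utf-8")
    buffer += decompressor.flush()
    text = buffer.strip()
    if text:
        yield text.decode("utf-8")


if __name__ == "__main__":
    # Simulate a compressed stream arriving in 7-byte pieces.
    payload = gzip.compress(b'{"a": 1}\r\n{"b": 2}\r\n')
    pieces = [payload[i:i + 7] for i in range(0, len(payload), 7)]
    for line in iter_gzip_lines(pieces):
        print(line)
```

Keeping the incomplete tail in a buffer is the same idea as the "put it back only if last element" step in the PHP client excerpt.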
21. ? only replaces exactly one character, i.e. it is useful in consideration of British and American English, e.g. reali?ation finds "realisation" but also "realization".

The tilde symbol (~) analyses the surroundings of a character string which is enclosed in quotes and consists of at least two words. You cannot combine the tilde with the wildcard operator. E.g. "obama merkel"~5 finds "A statement released from the White House said Obama, Monti and Merkel agreed on certain steps" (3 jumps between both words). "obama merkel"~5 finds every entry containing the keywords obama and merkel within an interval of a maximum of 5 jumps.

The tilde symbol (~x) after a word searches for words similar to the given word. The value after the tilde (0, 1 or 2) defines the number of changed characters: roam~1 will also find "foam".

The tilde symbol (~) after a word will find this word as a two-part word with a hyphen, space or other special character in it: carsharing~ will find "carsharing", "car-sharing", "car sharing", etc.

A single + in front of a keyword matches an exact character string, including special characters and punctuation; it does not consider lower and upper case. It also works with brackets and tilde (+l'oréal, +d&g, etc.).

Two + in front of a keyword match an exact character string, including special characters and punctuation; it does consider lower and upper case. It also works with brackets and tilde (++L'Oréal).

The NEAR/x operator works similar to the
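The "number of changed characters" behind the similarity search can be illustrated with the classic edit distance. The Python sketch below assumes that this is what the value after the tilde means, which the text does not state explicitly; edit_distance and fuzzy_match are hypothetical helpers, not the search engine's implementation.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute if different
            ))
        previous = current
    return previous[-1]


def fuzzy_match(term, word, max_changes):
    """True if word is within max_changes edits of term (case-insensitive)."""
    return edit_distance(term.lower(), word.lower()) <= max_changes
```

Under this reading, "roam" with a tolerance of 1 matches "foam" (one substitution) but not "form", which needs two changes.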
22. Descriptions (continued from the previous part of the table):
- or application.
- The linked application is inactive or deleted.
- The used access token does not have enough access rights. rights_req will list the required access rights, rights_got lists the access rights provided by that access token.
- A non-existing stream was accessed.
- The search document modification operation is not supported. reason and details will provide more information.
- The JSON that was passed via POST could not be properly interpreted; it was not in the expected format.
- Modifying a stream failed. See reason for details.
- Exceeded the maximum allowed rules for this API account. number_max is the limit, number_available how many we can save and number_saving the number we tried to save.
- Exceeded maximum amount of streams (number_max).
- The stream streamid is already defined.

http code   status code   message
403         22            Number of sources to set exceeds maximum number of sources
403         23            Stream has no rules defined
403         24            Stream got disconnected because newer stream running
403         25            Stream got disconnected
404         26            Endpoint or action not found
403         27            Connection is not secure, must use HTTPS
404         28            User was not found in this application
403         29            Access to this project is forbidden
429         30            Limit of maximum concurrent streams reached
404         31            Could not find rule with id
404         32            Could
403         33
505         34
400         35
400         36
403
23. talkwalker/p/<project_id>/resources?access_token=<access_token>

curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&topic=<topic_id_1>&topic=<topic_id_2>'

How to stream all documents from a Talkwalker project for a specific month

The following command creates a stream "test", used to stream the documents to your application:

curl https api talkwalker com api v2 stream test access token lt access token d H Content Type application json charset UTF 8

You can then use the test stream to stream all documents from August 2014 from your Talkwalker project to your application. To get only the documents from August, set a query published:>1406851200000 AND published:<1409529600000 to restrict the stream to documents from August, and set stream_resume=1406851200000 to start the stream on August 1. Set a stream_stop time later than the end of August so that you get all documents from August, including those that were found and streamed later. For example, use the current time: stream_stop=1422543275000.

Note: To get all documents from August, do not set stream_stop to the end of August. Documents that were published in August could have been added to the stream at a later point, as we only found them later.

curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&q=published:>1406851200000%20AND
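The month boundaries used above (1406851200000 and 1409529600000) are the first instants of August and September 2014 in UTC, expressed in milliseconds. The Python sketch below derives them; the function names are hypothetical, and the published:> and published:< query form is assumed from the example in this excerpt, where the punctuation was partly lost.

```python
from datetime import datetime, timezone


def month_bounds_ms(year, month):
    """Return (start, end) of a calendar month in epoch milliseconds, UTC."""
    start = datetime(year, month, 1, tzinfo=timezone.utc)
    # The first day of the following month; December rolls over to January.
    end = datetime(year + (month == 12), month % 12 + 1, 1, tzinfo=timezone.utc)
    return int(start.timestamp()) * 1000, int(end.timestamp()) * 1000


def month_query(year, month):
    """Build a query restricting results to documents published in that month."""
    start, end = month_bounds_ms(year, month)
    return f"published:>{start} AND published:<{end}"
```

The start value doubles as stream_resume, while stream_stop should be a later time such as the current time, for the reason given in the note above.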
24. 1390176000000 and max to 1390608000000 to get a histogram of results published between 20.01.2014 and 25.01.2014, with the start timestamp included and the end timestamp excluded (default values).

curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=birds&min=1390176000000&max=1390608000000&pretty=true'

The min and max parameters accept timestamps in epoch format (milliseconds after 1.1.1970 UTC).

Get a histogram and statistics over engagement

For types different from published and search_indexed, the histogram API also returns statistics (average, minimum, maximum and sum) over every bin.

curl 'https://api.talkwalker.com/api/v1/search/histogram/engagement?access_token=demo&q=birds&pretty=true'

Response:
status_code: 0
status_message: "OK"
request: "GET /api/v1/search/histogram/engagement?access_token=demo&q=birds&pretty=true"
result_histogram: header, data ... (truncated)

Talkwalker Search API and Talkwalker Projects

https://api.talkwalker.com/api/v1/search/p/<project_id>/

How it works

Talkwalker users can use the topics defined in their
25. Source type descriptions (end of the list): Results from Mixcloud; Results from SoundCloud; Results from Vimeo; Results from Dailymotion; Everything else which does not fit into the above listed categories.

Language options (language name followed by its code). Codes for entries preceding this excerpt: ab, aa.

ANS af; AKAN ak; ALBANIAN sq; AMHARIC am; ARABIC ar; ARAGONESE an; ARMENIAN hy; ASSAMESE as; AVARIC av; AVESTAN ae; AYMARA ay; AZERBAIJANI az; BAMBARA bm; BASHKIR ba; BASQUE eu; BELARUSIAN be; BENGALI bn; BIHARI bh; BISLAMA bi; BOSNIAN bs; BRETON br; BULGARIAN bg; BURMESE my

CATALAN ca; CHAMORRO ch; CHECHEN ce; CHINESE zh; CHINESE SIMPLIFIED zh-cn; CHINESE TRADITIONAL zh-tw; CHURCH SLAVIC cu; CHUVASH cv; CORNISH kw; CORSICAN co; CREE cr; CROATIAN hr; CZECH cs; DANISH da; DIVEHI dv; DUTCH nl; DZONGKHA dz; ENGLISH en; ESPERANTO eo; ESTONIAN et; EWE ee; FAROESE fo; FIJIAN fj; FINNISH fi; FRENCH fr; FRISIAN fy; FULAH ff; GALLEGAN gl; GANDA lg; GEORGIAN ka; GERMAN de; GREEK el; GREENLANDIC kl

HERERO hz; HINDI hi; HIRI MOTU ho; HUNGARIAN hu; ICELANDIC is; IDO io; IGBO ig; INDONESIAN id; INTERLINGUA ia; INTERLINGUE ie; INUKTITUT iu; INUPIAQ ik; IRISH ga; ITALIAN it; JAPANESE ja; JAVANESE jv; KANNADA kn; KANURI kr; KASHMIRI ks; KAZAKH kk; KHMER km; KIKUYU ki; KINYARWANDA rw; KIRGHIZ ky; KOMI kv

26. KONGO kg; KOREAN ko; KURDISH ku; KWANYAMA kj; LAO lo; LATIN la; LATVIAN lv; LIMBURGISH li; LINGALA ln; LITHUANIAN lt; LUBA KATANGA lu; LUXEMBOURGISH lb; MACEDONIAN mk; MALAGASY mg; MALAY ms; MALAYALAM ml; MALTESE mt; MANX gv; MAORI mi; MARATHI mr; MARSHALLESE mh; MOLDAVIAN mo; MONGOLIAN mn; NAURU na; NAVAJO nv; NDONGA ng; NEPALI ne; NORTHERN SAMI se; NORTH NDEBELE nd; NORWEGIAN no; NORWEGIAN BOKMAL nb; NORWEGIAN NYNORSK nn; NYANJA ny

PALI pi; PANJABI pa; PERSIAN fa; POLISH pl; PORTUGUESE pt; PUSHTO ps; QUECHUA qu; RAETO ROMANCE rm; ROMANIAN ro; RUNDI rn; RUSSIAN ru; SAMOAN sm; SANGO sg; SANSKRIT sa; SARDINIAN sc; SCOTTISH GAELIC gd; SERBIAN sr; SHONA sn; SICHUAN YI; SINDHI sd; SINHALESE si; SLOVAK sk; SLOVENIAN sl; SOMALI so; SOUTHERN SOTHO st

SOUTH NDEBELE nr; SPANISH es; SUNDANESE su; SWAHILI sw; SWATI ss; SWEDISH sv; TAGALOG tl; TAHITIAN ty; TAJIK tg; TAMIL ta; TATAR tt; TELUGU te; THAI th; TIBETAN bo; TIGRINYA ti; TONGA to; TSONGA ts; TSWANA tn; TURKISH tr; TURKMEN tk; TWI tw; UIGHUR ug; UKRAINIAN uk; URDU ur; UZBEK uz; VENDA ve; VIETNAMESE vi; VOLAPUK vo; WALLOON wa; WELSH cy; WOLOF wo; XHOSA xh; YIDDISH yi

GUARANI; GUJARATI; HAITIAN; HAUSA; HEBREW

Country Options

AFGHANISTAN; ALAND ISLAND
27. NECTTIMEOUT 30 curl setopt ch CURLOPT TIMEOUT 90 curl setopt ch CURLOPT_FAILONERROR FALSE curl setopt ch CURLOPT HEADER TRUE curl setopt ch CURLOPT USERAGENT PhpExampleClient 1 0 0 curl setopt ch CURLOPT ENCODING gzip public function run streamid project start ts stop ts this resume ts start ts while this finished this unprocessed data this error data this header size 1 this gt header complete FALSE this gt header ch curl_init url this gt url v2 stream s streamid if lempty project url a p project _url results url access token this gt token url amp stream resume this resume ts amp stream stop stop ts curl setopt ch CURLOPT URL url curl setopt ch CURLOPT HTTPGET 1 this gt setCurlOptions ch headers array Cache Control no cache Pragma no cache Content Language en US curl setopt ch CURLOPT HTTPHEADER headers curl setopt ch CURLOPT WRITEFUNCTION array this read stream curl exec ch check if something is in error data check error code http status curl getinfo ch CURLINFO HTTP CODE if curl errno ch 0 amp amp http status 200 this gt finished TRUE else error occurred if http status gt 0 amp amp http status 200 this onStatusError this
28. RLOPT HTTPHEADER headers result curl exec ch curl close ch answer json decode result if answer null amp amp answer status code 0 I echo result return echo DELETED STREAM name An return name Test call method function main url https api talkwalker com api v2 stream s stream id gt p lt project id results access token token start ts time 1000 stop ts time 1000 60 60 1000 example new TalkwalkerApiStreamingClientExample url start ts stop ts example run main gt Java client java package com trendiction api client streamapi streaming2 import java io BufferedReader import java io ByteArrayInputStream import java io ByteArrayOutputStream import java io DataOQutputStream import java io IOException import java io InputStream import java io InputStreamReader import java net HttpURLConnection import java net URL import java net URLConnection import java util HashMap import java util Map import java util concurrent atomic AtomicLong import java util zip GZIPInputStream import org apache commons io IOUtils import org codehaus jackson node JsonNodeFactory import org codehaus jackson node ObjectNode import com fasterxml jackson core JsonFactory import com fasterxml jackson core type TypeReference import com fasterxml jackson databind ObjectMapper import com trendiction co
29. S Article Facebook Shares Article Facebook Likes Article Twitter Retweets Article Twitter Likes Article URL Views Article Pinterest Likes Article Pinterest Pins Article Pinterest Re Pins YouTube Video Views YouTube Video Comments YouTube Video Likes Write access through Comment API y Number of Facebook share an article has y Number of Facebook likes an article has y Number of Twitter retweets an article has y Number of Twitter likes an article has y y Number of Pinterest likes an image has y Number of Pinterest pins an image has y Number of Pinterest re pins an article has y Number of YouTube views a video has y Number of YouTube comments a video has y Number of YouTube likes a video has article extended attr ibutes youtube dislikes instagram likes twitter shares source extended att ributes alexa pageviews facebook followers twitter followers instagram followers pinterest followers article attributes worlddata continent worlddata country worlddata region worlddata city worlddata longitude worlddata latitude country code resolution id type name birthdate gender image url short name url ARTICLE EXTENDED Write access through Comment ATTRIBUTES YouTube Video Dislikes Instagram Image Likes Article Twitter Shares SOURCE EXTENDED Write access through API ATTRIBUTES Alexa Page Views Facebook Followers Twitter Followers Instagram Followers P
30. S SC sl 5g SX Sk si sb so Za gs kr SS es 1k sd sr sj SZ se ch sy tw tj tz th COTE DIVOIRE CROATIA CUBA CURACAO CYPRUS CZECH REPUBLIC DENMARK DJIBOUTI DOMINICA DOMINICAN REPUBLIC ECUADOR EGYPT EL SALVADOR EQUATORIAL GUINEA ERITREA ESTONIA ETHIOPIA FALKLAND ISLANDS FAROE ISLANDS FIJI FINLAND FRANCE FRENCH GUIANA FRENCH POLYNESIA FRENCH SOUTHERN TERRITORIES GABON GAMBIA GEORGIA GERMANY GHANA ci hr CU cw cy Cz dk dj dm do ec eg sv er ee et fk fo fj fi fr gf pf tf ga gm ge de gh MAURITIUS MAYOTTE MEXICO MICRONESIA MOLDOVA MONACO MONGOLIA MONTENEGRO MONTSERRAT MOROCCO MOZAMBIQUE MYANMAR NAMIBIA NAURU NEPAL NETHERLANDS NETHERLANDS ANTILLES NEW CALEDONIA NEW ZEALAND NICARAGUA NIGER NIGERIA NIUE NORFOLK ISLAND NORTHERN MARIANA ISLANDS NORTH KOREA NORWAY OMAN PAKISTAN PALAU mu yt mx fm md mc mn me ms ma mz mm na nr np nl an nc nz ni ne ng nu nf mp kp no om pk pw THE DEMOCRATIC REPUBLIC OF CONGO TIMOR LESTE TOGO TOKELAU TONGA TRINIDAD AND TOBAGO TUNISIA TURKEY TURKMENISTAN TURKS AND CAICOS ISLANDS TUVALU UGANDA UKRAINE UNITED ARAB EMIRATES UNITED KINGDOM UNITED STATES UNITED
31. S ALBANIA ALGERIA AMERICAN SAMOA ANDORRA ANGOLA ANGUILLA ANTARCTICA ANTIGUA AND BARBUDA ARGENTINA ARMENIA ARUBA AUSTRALIA AUSTRIA AZERBAIJAN BAHAMAS BAHRAIN BANGLADESH BARBADOS BELARUS BELGIUM BELIZE BENIN gn gu ht ha he af ax al dz as ad ao ai aq ag ar am aw au at az bs bh bd bb by be bz bj OCCITAN OJIBWA ORIYA OROMO OSSETIAN GIBRALTAR GREECE GREENLAND GRENADA GUADELOUPE GUAM GUATEMALA GUERNSEY GUINEA GUINEA BISSAU GUYANA HAITI HEARD ISLAND AND MCDONALD ISLANDS HONDURAS HONG KONG HUNGARY ICELAND INDIA INDONESIA IRAN IRAQ IRELAND ISLE OF MAN ISRAEL oc or om os gi gr gl gd 9p gu gt 99 gn gw gy ht hm hn hk hu is in id de iq ie im il YORUBA ZHUANG ZULU PALESTINE PANAMA PAPUA NEW GUINEA PARAGUAY PERU PHILIPPINES PITCAIRN POLAND PORTUGAL PUERTO RICO QATAR REUNION ROMANIA RUSSIA RWANDA SAINT BARTHELEMY SAINT HELENA SAINT KITTS AND NEVIS SAINT LUCIA SAINT MARTIN SAINT PIERRE AND MIQUELON SAINT VINCENT AND THE GRENADINES SAMOA SAN MARINO yo zd zu ps pa pg py pe ph pn pl pt pr qa re ro ru rw bl sh kn Te mf pm vc WS sm BERMUDA BHUTAN BOLIVIA BONAIRE SINT EUSTASIUS AND SABA
32. Talkwalker API Talkwalker 16 Avenue Monterey L 2163 Luxembourg Updated November 2015 Table of Contents Talkwalker API Overview X 1 Talkwalker Search API Overview amp Example seeeeeeeeeeeee heh hmm 1 Talkwalker Streaming API Overview amp Example 3 Talkwalker Search API iii ri ce ee abe E E EEEE E and Sukie E EA E EN EE caco 6 Talkwalker Search Results APT cinc a o Free eia eia eiaa lah eR Ie ae tara i d 6 Talkwalker Search Histogram API ANEN ENN ENN RENE ee x au e v Er REX ERR wed Y ERRAT a Ed 7 Talkwalker Search API and Talkwalker Projects 0 0 cece eee eI n 12 Modifiying documents with the Talkwalker An 15 Talkwalker a E EE 17 lU 17 Seen 17 IUE ECHO HUE 21 Matching of Streams Rules and Panels onere enr hv Per en Aerer bes ve RR sed eb ea des 27 Quota on Streams eoe umo eo exci BEEF C Nate eee are eed e eda mice E Y ra EA EE E Ga oe Haine ge S d ES 28 Temporarily Disable Strearmis AA is SEET ess Vp ac ee aig wh oe Rer ctim ue E Ul Rodi EE e dt 30 Talkwalker Streaming API and Talkwalker Drolects I n 30 Talkwalker Single TEE 31 SOUT CO PR Mn 31 Talkwalker Login Url 4221 25 m bmi Onb Er rae di se Geek des p Vra dcdit e d a sad 31 Iro eoeine 32 CESA EN 32 Project Distinciones AAA aia ra Gee wares 34 KUTTEN 36 Channelmonitoring Suggest N
33. Twitter share an article has
Number of Twitter followers a source has
Number of YouTube views a video has
Number of YouTube likes a video has
Number of YouTube dislikes a video has
Number of Comments an article has

Parameters

parameter: description; required; allowed values; default value
access_token: a read/write token specified in the API application; required
q: The query to search for; required; Talkwalker query syntax
min: Minimum value for bins; optional; Long/Integer value
max: Maximum value for bins; optional; Long/Integer value
min_include: Include min value; optional; true/false; default true
max_include: Include max value; optional; true/false; default false
interval: Bin interval; optional; Long/Integer, duration for published and search_indexed; default dynamic
timezone: Time zone for interval; optional; tz database timezone name, e.g. Europe/Luxembourg; default UTC

Possible values for interval when creating a histogram over published or search_indexed: year, quarter, month, week, day, hour, minute, second, as well as numeric values with the units w (week), d (day), h (hours), m (minutes) and s (seconds), e.g. 5d for 5 days or 2w for 2 weeks.

The maximum number of histogram bins is 400. If the min, max and interval parameters result in a larger number of bins, an error message (HTTP 400) is returned. Try reducing the range or increasing the interval.

Credits: 10 credits per call

Examples

Get a histogram over the last 8 days of online news results co
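The 400-bin limit described above can be checked on the client before a request is sent. The sketch below estimates the number of bins for a numeric interval such as 5d or 2w. It assumes that the number of bins is the range divided by the interval, rounded up; how the server actually counts bins is not stated in the text. Only the units w, d, h and s are covered, and the function names are illustrative.

```python
import re

MAX_BINS = 400  # limit stated in the text

# Milliseconds per unit for numeric intervals such as "5d" or "2w".
UNIT_MS = {"w": 7 * 86_400_000, "d": 86_400_000, "h": 3_600_000, "s": 1_000}


def interval_ms(interval):
    """Convert an interval like '5d' into milliseconds."""
    match = re.fullmatch(r"(\d+)([wdhs])", interval)
    if match is None:
        raise ValueError(f"unsupported interval: {interval!r}")
    return int(match.group(1)) * UNIT_MS[match.group(2)]


def bin_count(min_ms, max_ms, interval):
    """Estimated number of bins: range divided by interval, rounded up."""
    step = interval_ms(interval)
    return -(-(max_ms - min_ms) // step)  # integer ceiling division


def fits_limit(min_ms, max_ms, interval):
    """True if the estimated bin count stays within the documented maximum."""
    return bin_count(min_ms, max_ms, interval) <= MAX_BINS


# Five days of data, as in the published histogram example.
print(bin_count(1390176000000, 1390608000000, "1d"))
print(fits_limit(1390176000000, 1390608000000, "1s"))
```

If `fits_limit` returns False, widening the interval or narrowing the range before calling the API avoids the HTTP 400 response.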
34. agram_followers:100 comment_count:0 published:>1420731027000 searchindexed:>1420731027000 sample:25 sample_million:2000 sentiment:>0 sentiment:negative

Note: Some documents have precise geographic data in the form of GPS-measured coordinates provided by the source. For other documents this data is based on source metadata with a certain precision level. These levels, ordered from lowest precision to highest, are: country, region and city (extracted data), and coordinates (exact data). For the lower precision levels, the coordinates are those of the respective capital.

Restriction: sourcegeo
Description: Restricts the results to a rectangular geographic area defined by the coordinates (latitude, longitude) of the upper left and lower right corner.
Example: sourcegeo:50.3,5.7,49.4,6.5

Restriction: sourcegeo_resolution
Description: Restricts to documents that have a minimum precision level of location data. Possible levels are coordinates, city, region and country (default: all documents).
Example: sourcegeo_resolution:coordinates

Example: Search for documents that are in a box that roughly corresponds to Luxembourg and have exact coordinates. Luxembourg's north end is at around 50.3, south is at 49.4, west at 5.7 and east at 6.5; the upper left corner is (50.3, 5.7), the lower right corner is (49.4, 6.5). The final query is:

sourcegeo:50.3,5.7,49.4,6.5 AND sourcegeo_resolution:coordinates

Special Query Modifiers

All queries are executed
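The Luxembourg example above turns four edge values into a sourcegeo restriction. A small helper can do the same for any box. The exact punctuation of the restriction (colon after the field name, comma-separated values) is an assumption here, since it did not survive in this copy of the text, and `sourcegeo_query` is a hypothetical name.

```python
def sourcegeo_query(north, west, south, east, resolution=None):
    """Build a sourcegeo restriction from the edges of a bounding box.

    The box is given as upper left corner (north, west) and lower right
    corner (south, east), in the order described in the text.
    """
    if north < south:
        raise ValueError("north must not be smaller than south")
    query = f"sourcegeo:{north},{west},{south},{east}"
    if resolution is not None:
        # Optional minimum precision level, e.g. "coordinates" or "city".
        query += f" AND sourcegeo_resolution:{resolution}"
    return query


# Box that roughly corresponds to Luxembourg, exact coordinates only.
print(sourcegeo_query(50.3, 5.7, 49.4, 6.5, "coordinates"))
```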
35. aker query syntax for each application curl XPUT https api talkwalker com api v2 stream s teststream access token access token d rules rule id rule 1 query foo rule_id rule 2 query bar The returned results will be in the format below The documents can be separated using matched query which indicates which rule the result belongs to d chunk type CT RESULT chunk result datae data default result data see simple search highlighted data 1 matched rule id rule 1 d title snippet lt title snippet for rule gt content snippet content snippet for rule gt y How to get the number of results grouped by media types The Talkwalker API provides only documents and histograms to group results into custom sets you have to get all the results and then compute those sets locally Alternatively you can perform separate searches or histograms for each of the groups you want to create use the Talkwalker query syntax to restrict the results to those matching a single group How to get the ids of Talkwalker Topics To get a list of the search topics defined in a Talkwalker project use the project id and the access token on the https api talkwalker com api v2 talkwalker p lt project id resources endpoint with the filter type search curl https api talkwalker com api v2 talkwalker p project id resources access token acce
36. am ids, rule ids and panel ids all must be unique within the project.

Rate Limit: This endpoint is limited to 5 calls per minute. Only one connection can be opened; if multiple streams were defined, they must be streamed through one single connection (see above how to select multiple streams).

Managing Streams

Stream Create and Stream Definition

Creating a new stream and getting the definition of a stream are done on the https://api.talkwalker.com/api/v2/stream/s/<streamid> endpoint using the methods PUT and GET.

Parameters (endpoint parameters)

parameter: description; required; default value
access_token: a read/write token specified in the API application; required

Example: create a new stream

{"stream_id": "teststream", "rules": [{"rule_id": "rule-1", "query": "cats"}]}

Command:

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream?access_token=demo' -d '{"rules": [{"rule_id": "rule-1", "query": "cats"}]}' -H "Content-Type: application/json; charset=UTF-8"

Response:

{"status_code": "0", "status_message": "OK", "request": "PUT /api/v2/stream/s/teststream?access_token=demo", "result_stream": {"data": [{"stream_id": "teststream", "rules": [{"rule_id": "rule-1", "query": "cats"}]}]}}

Example: get the stream teststream

curl -XGET 'https://api.talkwalker.com/api/v2/stream/s/teststream?access_token=demo'

The response will be the same as before.

Rate Limit: This endpoint is limited to 20
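The stream creation call above sends a small JSON document with a PUT request. The sketch below assembles the URL, body and header for such a call without sending anything. The field names `rules`, `rule_id` and `query` are read off the example; the hyphen in `rule-1` and the function name `create_stream_request` are assumptions.

```python
import json

API_ROOT = "https://api.talkwalker.com/api/v2/stream/s/"


def create_stream_request(stream_id, token, rules):
    """Return (url, body, headers) for a PUT that creates a stream.

    `rules` maps rule ids to queries; using a dict keeps rule ids unique
    within the stream, which the text requires.
    """
    url = f"{API_ROOT}{stream_id}?access_token={token}"
    body = json.dumps(
        {"rules": [{"rule_id": rule_id, "query": query} for rule_id, query in rules.items()]}
    )
    headers = {"Content-Type": "application/json; charset=UTF-8"}
    return url, body, headers


url, body, headers = create_stream_request("teststream", "demo", {"rule-1": "cats"})
print(url)
print(body)
```

The returned triple can be handed to any HTTP client; the same URL with GET returns the stream definition.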
37. an be used with the Streaming Results API. To limit the results of a predefined stream to those matching a topic, set topic to that topic's ID (multiple topics can be set; see Talkwalker Search API and Talkwalker Projects).

Example: Set up a stream that streams all new data for a Talkwalker project. You will need your custom API application access token.

To find the ID of your project, use:

curl 'https://api.talkwalker.com/api/v1/search/info?access_token=<access_token>'

To get a list of all topics:

curl 'https://api.talkwalker.com/api/v1/search/p/<project_id>/topics/list?access_token=<access_token>'

To create the stream:

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream?access_token=<access_token>' -d '{"streamid": "teststream"}' -H "Content-Type: application/json; charset=UTF-8"

To start the stream:

curl 'https://api.talkwalker.com/api/v2/stream/s/teststream/p/<project_id>/results?access_token=<access_token>&topic=<topic_id_1>&topic=<topic_id_2>'

See FAQ for more examples.

Talkwalker Single Sign-on API

Source: https://api.talkwalker.com/api/v2/auth

Note: The Single Sign-on API needs a special access token of type authentication, and the endpoints must be called via a secure connection (HTTPS).

Talkwalker Login Url

curl 'https://api.talkwalker.com/api/v2/auth/u/<user_id>/loginurl?access_token=<access_token>'

The Talkwalker Single Sign-on API is used to retrieve a
38. at wants to use such data needs OAuth 2.0 credentials for Talkwalker (a client id and client secret) and needs to provide a redirect URL.

To ask for permission to access private data, the external application redirects the user to:

http://www.talkwalker.com/app/oauth/authorize?client_id=<client_id>&response_type=code&redirect_uri=<redirect_uri_encoded>&scope=projects

After the user has granted permission, they will be redirected to the redirect URL provided by the external application. This redirect will include a query string with an access code parameter: code=<access_code>.

To get the actual OAuth access token for a user, the external application makes a POST request to:

http://www.talkwalker.com/app/oauth/access_token?client_id=<client_id>&client_secret=<client_secret>&grant_type=authorization_code&redirect_uri=<redirect_uri_encoded>&code=<authorization_code>

with the header Content-Type: application/x-www-form-urlencoded. The Talkwalker server will respond with a body of the following form:

{"access_token": "<oauth_access_token>"}

The external application can now use the OAuth access token instead of a Talkwalker API access token. Instead of setting the query string field access_token, the requests must contain the header field Authorization set to Bearer <oauth_access_token> (for more information about OAuth 2.0 see http://oauth.net/2/).

OAuth 2.0 Setup

To get a client id and a client secret pleas
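The first step of the OAuth flow above is a redirect to an authorization URL with an encoded redirect URI, and the last step replaces the access_token query field with an Authorization header. Both can be sketched as follows. The URL path and parameter names are read off the text, where their separators are partly lost, so they should be treated as assumptions; `authorize_url` and `bearer_header` are illustrative names, and the client id and callback below are placeholders.

```python
from urllib.parse import urlencode

# Authorization endpoint as given in the text (path separators assumed).
AUTHORIZE_URL = "http://www.talkwalker.com/app/oauth/authorize"


def authorize_url(client_id, redirect_uri, scope="projects"):
    """URL the external application redirects the user to for permission."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,  # urlencode percent-encodes this value
        "scope": scope,
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


def bearer_header(oauth_access_token):
    """Header that replaces the access_token query field on API requests."""
    return {"Authorization": "Bearer " + oauth_access_token}


print(authorize_url("my-client-id", "http://localhost:8080/callback"))
print(bearer_header("example-token"))
```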
39. ated or deleted rule.

Example: Add a rule to limit a stream to only German results

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream/r/rule-1?access_token=demo' -d '{"query": "lang:de"}' -H "Content-Type: application/json; charset=UTF-8"

Response:

{"status_code": "0", "status_message": "OK", "request": "PUT /api/v2/stream/s/teststream/r/rule-1?access_token=demo", "result_stream": {"data": [{"stream_id": "teststream", "rules": [{"rule_id": "rule-1", "query": "lang:de"}]}]}}

Get an existing rule:

curl -XGET 'https://api.talkwalker.com/api/v2/stream/s/teststream/r/rule-1?access_token=demo'

Delete an existing rule:

curl -XDELETE 'https://api.talkwalker.com/api/v2/stream/s/teststream/r/rule-1?access_token=demo'

Rules that are not in valid Talkwalker query syntax will be rejected (error 400: 4 Error in query); in this case the old rules will not be replaced.

Parameters (endpoint parameters)

parameter: description; required; default value
access_token: a read/write token specified in the API application; required

Rate Limit: This endpoint is limited to 20 calls per minute.

Panels

The panel defines a source set that is considered for streaming. It can contain a whitelist with an include query (include_query) or a blacklist with an exclude query (exclude_query). To create, get or delete a panel, use the https://api.talkwalker.com/api/v2/panel/a/<panel_id> endpoint. Panels are defined using the Talkwalker qu
40. back to your predefined rules. Each rule allows filtering by title, content, author, language, URL, country, media type and more parameters, using the same syntax as in our Talkwalker Search interface. You can also apply a list of sources to be included in or excluded from the stream, which narrows down the results even further.

A single rule can support up to 50 operands. To create complex rules, operands may be combined using Boolean operators.

The documents are streamed in the order they are found by our crawlers and added to Talkwalker, i.e. by search_indexed timestamp. Custom sorting is not possible with the Streaming API; however, this can be done with the Search API. The documents are grouped in timeframes, which contain all documents that were indexed between the given start and end time of the timeframe.

Each result, independent of how many rules match, will be counted as 1 credit.

A brief example: Streaming

The Talkwalker API streaming endpoint (https://api.talkwalker.com/api/v2/stream) is used to stream results from Talkwalker.

Creating a Stream

Command:

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream?access_token=demo' -d '{"rules": [{"rule_id": "rule-1", "query": "cats"}]}' -H "Content-Type: application/json; charset=UTF-8"

Response:

{"status_code": "0", "status_message": "OK", "request": "PUT /api/v2/stream/s/teststream?access_token=demo", "result_stream": {"data":
41. calls per minute.

Stream Delete

The https://api.talkwalker.com/api/v2/stream/s/<stream_id> endpoint is used to delete a stream.

Example:

curl -XDELETE 'https://api.talkwalker.com/api/v2/stream/s/teststream?access_token=demo&pretty=true'

Parameters

parameter: description; required; default value
access_token: a read/write token specified in the API application; required

Rate Limit: This endpoint is limited to 20 calls per minute.

Stream Info

The https://api.talkwalker.com/api/v2/stream/info endpoint returns a list of all Talkwalker API streams linked to a Talkwalker API access token.

Example:

curl 'https://api.talkwalker.com/api/v2/stream/info?access_token=demo&pretty=true'

Response:

{"status_code": "0", "status_message": "OK", "request": "GET /api/v2/stream/info?access_token=demo", "result_streaminfo": {"data": [{"name": "teststream"}]}}

Parameters (endpoint parameters)

parameter: description; required; default value
access_token: a read/write token specified in the API application; required

Rate Limit: This endpoint is limited to 20 calls per minute; the result should be stored.

Rules

The https://api.talkwalker.com/api/v2/stream/s/<stream_id>/r/<rule_id> resource is used to set new rules for an existing stream. Rules are used to filter out unwanted results on a stream. Talkwalker Streaming API rules are specified in the Talkwalker query syntax. The response only includes the requested cre
42. cebook The Number of Fans of the Page Note Only available for public pages which are monitored by Talkwalker we don t collect any fan counts for user profiles Twitter The number of Followers of the author Images Optional images MEDIA_ENTRY Write access Comment through API unt Image Url y Link to Image width Image Width y Width of image if available etate Image Height y Height of image if available legend Image legend y Legend text Videos Optional videos url width height legend Attributes MEDIA ENTRY Video Url Video Width Video Height Video legend Write access through API y y y y These fields are only set for certain post types Comment Link to Image Width of image if available Height of image if available Legend text of video Article extended attributes fields will be updated for up to 1 month The source extended attributes represent the exact value at publication Not all urls will have all meta data e g e Blog news and messageboard posts not their comments will only have facebook_shares twitter_shares set All the other types will only be set if the sourcetype is of the same type and if the data is available article extended attr ARTICLE EXTENDED ibutes facebook shares facebook likes twitter retweets twitter likes url views pinterest likes pinterest pins pinterest repins youtube views youtube comments youtube likes ATTRIBUTE
43. ch, Wildcard Search (one character), Proximity Search, Fuzzy X Search, Fuzzy Search, Raw Data Search, Exact Raw Data Search, NEAR/x, ONEAR/x

AND combines two keywords. BMW AND bike will find all entries which mention the keyword BMW and the keyword bike.

AND NOT excludes a word from an entry. BMW AND NOT bike will find all entries with the keyword BMW, but only if the term bike is not contained in the same article.

OR means that at least one of the terms linked by an OR has to be mentioned in the same article. BMW OR bike will find all entries that include either the keyword BMW or the keyword bike.

Negative filters can be created by using the operator NOT.

Quotes are used for finding keyword sequences. "BMW series" will find all entries which contain the phrase BMW series. In contrast, the search query BMW AND series does not respect the order.

Brackets are used to group several keywords so that operators can be applied to multiple terms within the brackets (distributive law). BMW AND (motorcycle OR car) is a short form for (BMW AND motorcycle) OR (BMW AND car).

The wildcard operator is a character that stands for zero or any number of characters. Wildcards are only accepted at the end of a keyword. Luxemb* will find all entries including keywords like Luxembourg, Luxemburg, Luxemburgisch or any other keyword with the prefix Luxemb.

The question mark has a similar function as the wildcard operator but
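The bracket rule and the trailing wildcard described above can be illustrated with a toy model in which a document is just a set of lowercase words. This is not how Talkwalker evaluates queries; it only demonstrates that the grouped and the expanded form of the BMW query select the same documents, and what a prefix match means. All function names are illustrative.

```python
from itertools import combinations


def grouped(words):
    """BMW AND (motorcycle OR car)"""
    return "bmw" in words and ("motorcycle" in words or "car" in words)


def expanded(words):
    """(BMW AND motorcycle) OR (BMW AND car)"""
    return ("bmw" in words and "motorcycle" in words) or ("bmw" in words and "car" in words)


def prefix_match(prefix, words):
    """Toy version of a trailing wildcard: keep words starting with prefix."""
    return [word for word in words if word.startswith(prefix)]


# Every possible toy document over a small vocabulary.
vocabulary = ["bmw", "motorcycle", "car", "bike"]
documents = [
    set(combo)
    for size in range(len(vocabulary) + 1)
    for combo in combinations(vocabulary, size)
]

# Both forms agree on all documents.
print(all(grouped(doc) == expanded(doc) for doc in documents))
print(prefix_match("luxemb", ["luxembourg", "luxemburg", "luxor"]))
```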
44. d curl https api talkwalker com api v2 stream s teststream results access token demo amp stream resume 1388534400000 Parameters parameter description required default value BEES LON a read write token specified in the API application required q The query to search for optional stream resume Resumes the stream from this starting point optional now stream lt stop Stops the stream at this point optional deis Stops the stream after the given number of hits optional stream stop Can be used to specify an end timestamp for the stream When the number of documents in max hits is reached the remaining documents of the timeframe are still streamed but not billed After this a control chunk containing the timestamp needed to resume the stream is send Multiple stream ids To stream results of multiple streams through one single connection all of the streaming endpoints accept multiple streams in the s stream id parameter The following syntax can be used example description single sen a single stream multiple t St1 test2 test3 a list of streams prefix Ir every stream that starts with test all all defined streams test test1 exclude every stream that stats with test except test1 While streaming the matched streams are expanded on the start of every chunk so that new streams get picked up automatically on a running connection Streaming will fail in case no stream matches the multiple streams description anymore Stre
45. d and timezone. Its usage depends on period:

period: reference time; explanation
hourly: beginning of hour; reference hour, minute in hour
daily: beginning of day; reference day, hour in day
weekly: beginning of week; reference week, day of week
monthly: beginning of month; reference month, day of month

Request information about a quota on a stream

Example:

curl -XGET 'https://api.talkwalker.com/api/v2/stream/s/teststream/quota?access_token=demo'

Response:

{"status_code": "0", "status_message": "OK", "request": "GET /api/v2/stream/s/teststream/quota?access_token=demo", "result_stream": {"data": [{"stream_id": "teststream", "quota": {"allowance": 10000, "reset": "hourly", "timezone": "UTC", "period_start": "2015-04-27T08:00:00.000Z", "period_reset": "2015-04-27T09:00:00.000Z", "usage": 0, "status": "active", "reference_time": "2015-01-01T00:00:00.000Z"}}]}}

To remove the quota from a stream

Example:

curl -XDELETE 'https://api.talkwalker.com/api/v2/stream/s/teststream/quota?access_token=demo'

A reset can also be triggered manually if a rule should be reactivated; the usage will then be reset to 0 for the current period.

curl -XPOST 'https://api.talkwalker.com/api/v2/stream/s/<streamid>/quota/reset?access_token=demo'

If the quota on a stream gets full before the end of a chunk, the data for the current chunk is still fully delivered. Reactivation of a stream occurs at chunk boundaries. Chunk boundarie
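The quota response above reports a period start and a period reset time. The sketch below reproduces those two values for the hourly and daily cases, under the simplifying assumption that the reference time sits exactly on a period boundary (as the 2015-01-01T00:00:00.000Z value in the example does) and that the timezone is UTC. Weekly and monthly periods and shifted reference times are left out. The function name `period_bounds` is illustrative.

```python
from datetime import datetime, timedelta, timezone


def period_bounds(now, reset):
    """Return (period_start, period_reset) for an hourly or daily quota.

    Assumes UTC and a reference time on the period boundary, so the period
    starts at the beginning of the current hour or day.
    """
    if reset == "hourly":
        start = now.replace(minute=0, second=0, microsecond=0)
        length = timedelta(hours=1)
    elif reset == "daily":
        start = now.replace(hour=0, minute=0, second=0, microsecond=0)
        length = timedelta(days=1)
    else:
        raise ValueError(f"unsupported reset period: {reset!r}")
    return start, start + length


# A moment inside the hour shown in the example response.
moment = datetime(2015, 4, 27, 8, 30, tzinfo=timezone.utc)
start, reset_at = period_bounds(moment, "hourly")
print(start.isoformat(), reset_at.isoformat())
```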
46. d artists my first jewels were made with beads but soon I discovered the potentials of so many materials and I developed my very personal style I would describe myself as a mixed media and eclectic artist My favorite materials include glass polymer clay metal sheets and wood but as I love experimenting the possibilities are endless What I love most about the creative process is the modeling and combining of materials I especially make rings and pendants but you will find some pins and earrings as well All my pieces are one of a kind so no two pieces are the same I love traveling and much of my work reflects the memories of places I love I also like to bring back from my trips beautiful and unique glass and ceramic beads and cabochons and found pieces such as ceramic shards and beach pottery to incorporate in my work or use as focal pieces In recent years title snippet Color and Light Inspirations in Jewelry SUNNY RINGS root url http annukcreations blogspot com domain url http blogspot com host url http annukcreations blogspot com parent url http annukcreations blogspot com 2014 12 sunny rings html langes ens truncated more on the Talkwalker Streaming API Talkwalker Search API Talkwalker Search Results API https api talkwalker com api v1 search results How it works The Talkwalker Search API allows you to retrieve up to 500 sorted results for a given timeframe
47. d will contain the snippet related to the query set in the datafeed On streaming this information is present in extra entrydata Url of the subsection of the site where article was posted on Url of the domain where article was posted on Url of the host where article was posted on Url of the parent of the article This is the post this urlis refering to e g in case of a comment the main article in case of a message board post the main post in the thread The language of the article Statistical Calculation of the pornographic Level Statistical Calculation of the fluency level Data Range 0 100 The Fluency Level of an article if low ifthe article is composed of stacked words without punctuation marks Spam level of the source Sentiment of text Negative neutral or positive Source type of the post Source Possible field values Example www zeit de blogs Example zeit de Example www zeit de 0 100 100 Pornographic Content 0 100 100 Normal Text 0 100 100 Spam gt 50 can be considered as spam 5 4 3 2 1 0 1 2 3 4 5 5 being negative and 5 being positive ONLINENEWS BLOG type can be any string and be user default OTHER defined field name name Write access Comment Possible field values through API past type Post Type y Type of the post If it s a text post default TEXT an image post video post or anything else cluster nd Cluster Id n Url
48. e contact us. You will have to provide one or more redirect_uri; for development purposes localhost is allowed.

Credits

Pricing

Monthly Reset of Credits

The credits will be reset every month on the day of the subscription at 03:00 UTC. Note that the monthly new results in Talkwalker projects are reset on the first of a new month at 0:00 UTC.

Remaining Credits Endpoint

The endpoint https://api.talkwalker.com/api/v1/status/credits is used to get an overview of consumed credits and API calls.

Response:

{"status_code": "0", "status_message": "OK", "request": "GET /api/v1/status/credits?access_token=demo", "result_creditinfo": {"used_credits_monthly": 0, "used_credits_onetime": 0, "remaining_credits_monthly": 0, "remaining_credits_onetime": 0, "next_billing_period": 1419634800000, "estimate_credits_used_until_end_of_billing_period": 0, "monthly_total": 0}}

Rate Limit: This endpoint is limited to 10 calls per minute; the result should be stored.

FAQ

How to stream all documents from a Talkwalker project

The following command creates a stream test used to stream the documents to your application:

curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/test?access_token=<access_token>' -d '{}' -H "Content-Type: application/json; charset=UTF-8"

You can then use the test stream to stream all documents in real time from your Talkwalker project to your application. This will return in real time all new
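The credits endpoint above is limited to 10 calls per minute and its result should be stored. One simple way to follow that advice is to keep the last result and only fetch again once enough time has passed. The sketch below is a generic helper, not part of any Talkwalker client; `StoredResult` and the stand-in fetch function are hypothetical, and the real HTTP call is deliberately left out.

```python
import time


class StoredResult:
    """Keep the last result of a rate-limited call and reuse it."""

    def __init__(self, fetch, max_calls_per_minute, clock=time.monotonic):
        self._fetch = fetch
        # 10 calls per minute means at most one call every 6 seconds.
        self._min_interval = 60.0 / max_calls_per_minute
        self._clock = clock
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return the stored value, refreshing it only when it is old enough."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._min_interval:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


def fake_credit_fetch():
    # Stand-in for the real request to the credits endpoint.
    return {"remaining_credits_monthly": 0}


credits = StoredResult(fake_credit_fetch, max_calls_per_minute=10)
print(credits.get())
print(credits.get())  # served from the stored value, no second fetch
```

The clock is injectable so the refresh logic can be exercised without waiting in real time.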
49. eamControl Map lt String Object gt controlChunk System out println CONTROL controlChunk protected void handleStreamResult Map String Object resultChunk Map lt String Object resultData getAsMap resultChunk data Map lt String Object entryData getAsMap resultData data String url getAsT entryData url String class System out println RESULT url protected void unhandledStreamChunk Map String Object unhandledChunk System out println UNHANDLED unhandledChunk protected void createStream throws IOException I String url url v1 stream create access_token token connect URL request new URL url URLConnection connection request openConnection connection setConnectT imeout 30000 connection setReadTimeout 90000 HttpURLConnection httpConnection HttpURLConnection connection httpConnection setRequestMethod POST httpConnection setRequestProperty User Agent JavaExampleClient 1 0 0 httpConnection setRequestProperty charset utf 8 httpConnection setDoOutput true httpConnection setDoInput true connection setUseCaches false connection setRequestProperty Content Language en US Data0utputStream wr new Data0utputStream connection get0utputStream JsonNodeFactory factory JsonNodeFactory instance ObjectNode on factory objectNode on put streamid stream_id System out println on toString wr wr
50. ect id 3 project name Project 3 account id account id 1 account name account name 1 access level ACCOUNT ADMIN i Ya 4 user name User 2 user email user_2 site com user id user id 2 project project id project id 2 project name Project 2 account id account id 1 account name account name 1 access level FULL TOOL Project List curl https api talkwalker com api v2 auth projects access token access token gt This endpoint returns a list of all the projects in an account and the users that have access Example https api talkwalker com api v2 auth projects access token lt access token pretty true Result status code 0 status message OK request GET api v2 auth projects access_token lt access_token gt amp pretty true result projects project project id project id 1 project name Project 1 account id account id 1 account name account name 1 user Wseneidsceusemeldels user name Admin 1 user email user_1 site com access level ACCOUNT ADMIN Fl jo d project id project id 2 project name Project 2 account id account id 1 account name account name 1 user suseni nda ese meld user name Admin 1 user email user_1 site com access level ACCOUNT ADMIN Ip al user id user id 2
51. el2" }, { "panel_id": "panel3" } ] } }

Parameters

Endpoint parameters:

    parameter    | description                                          | required? | default value
    access_token | a read/write token specified in the API application | required  |

Rate Limit

This endpoint is limited to 20 calls per minute.

Matching of Streams, Rules and Panels

When a document matches a rule, highlighted data is included in the result entry. When multiple rules match a query, the highlight data is repeated for every rule that matches.

Example:

    "highlighted_data": [ {
        "matched": { "rule_id": "rule-1", "stream_id": "stream-1", "panel_id": [ "panel-1", "panel-2" ],
                     "rule_query": "cats OR dogs" (if rule_id is not set on the rule) },
        "title_snippet": "Cats are ...",
        "content_snippet": "..."
    } ]

Quota on Streams

A quota can be specified for each stream. The quota limits the number of results delivered through a stream per hour, day or month. After the limit has been reached, the stream is deactivated until the next period begins. The connection stays open even if some or all streams are deactivated. Information about disabled streams is delivered through periodic control chunks.

Example:

    curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream/quota?access_token=demo' -d '{ "allowance": 1000, "reset": "daily", "timezone": "UTC", "reference_time": "2015-01-01T00:00:00.000Z" (or long) }'

The reference time defines a reference time in relation to the perio
52. ery syntax.

Example: Create the panel with the include query lang:de lang:fr and the exclude query sourcecountry:lu for the stream teststream, to restrict the stream to German and French results which are not from Luxembourg.

    curl -XPUT 'https://api.talkwalker.com/api/v2/panel/a/testpanel?access_token=demo' -d '{ "include_query": [ "lang:de", "lang:fr" ], "exclude_query": [ "sourcecountry:lu" ] }' -H "Content-Type: application/json; charset=UTF-8"

Response:

    { "status_code": "0", "status_message": "OK",
      "request": "PUT /api/v2/panel/a/testpanel?access_token=demo",
      "result_panel": { "data": [ { "panel_id": "testpanel", "include_query": [ "lang:de", "lang:fr" ], "exclude_query": [ "sourcecountry:lu" ] } ] } }

Getting a panel

    curl -XGET 'https://api.talkwalker.com/api/v2/panel/a/testpanel?access_token=demo'

Deleting a panel

Panels that are still referenced may not be deleted.

    curl -XDELETE 'https://api.talkwalker.com/api/v2/panel/a/testpanel?access_token=demo'

Panels that are not in valid Talkwalker query syntax will be rejected (error 400: Error in query); in this case the old panels will not be replaced.

Getting a list of all panels

    curl -XGET 'https://api.talkwalker.com/api/v2/panel/info?access_token=demo'

Response:

    { "status_code": "0", "status_message": "OK",
      "request": "GET /api/v2/panel/info?access_token=demo",
      "result_panel": { "data": [ { "panel_id": "panel1" }, { "panel_id": "pan
53. OAuth 2.0
Credits Pricing
FAQ
    How to stream all documents from a Talkwalker project
    How to stream all documents from a Talkwalker project for a specific month
    How to get the documents of the last hour of a Talkwalker project
    How to stream all documents from Talkwalker Page Monitoring
    How to eliminate retweets or comments from a stream
    How to get only documents of a Talkwalker project that include special keywords
    How to use a single stream for multiple applications/clients
    How to get the number of results grouped by media types
    How to get the ids of Talkwalker Topics
Code Examples
    Streaming Client Examples
Troubleshooting
    Error Codes
    Error Handling

Talkwalker API Overview

Talkwalker Search API: Overview & Example

How it works

The Talkwalker Search API allows you to retrieve up to 500 sorted results for a given timeframe withi
54. etty=true

    "result_loginurl": { "single_sign_on_url": "<app>/login?login_token=<token>&user_id=<user id>", "user_id": "<user id>", "expiration_date": 1423064059056 }

Logout

    curl 'https://api.talkwalker.com/api/v2/auth/u/<user_id>/logout?access_token=<access_token>'

The logout endpoint is used to log a user out of Talkwalker and to invalidate all tokens that were created for this user. All sessions for this user, whether authenticated with a single sign-on URL or with a password, will be closed.

User List

    curl 'https://api.talkwalker.com/api/v2/auth/users?access_token=<access_token>'

This endpoint returns a list of all the users in an account and the projects they have access to.

Example: https://api.talkwalker.com/api/v2/auth/users?access_token=<access_token>&pretty=true

Result:

    { "status_code": "0", "status_message": "OK",
      "request": "GET /api/v2/auth/users?access_token=<access_token>&pretty=true",
      "result_users": { "user": [
        { "user_name": "Admin 1", "user_email": "user_1@site.com", "user_id": "user id 1",
          "project": [
            { "project_id": "project id 1", "project_name": "Project 1", "account_id": "account id 1", "account_name": "account name 1", "access_level": "ACCOUNT_ADMIN" },
            { "project_id": "project id 2", "project_name": "Project 2", "account_id": "account id 1", "account_name": "account name 1", "access_level": "ACCOUNT_ADMIN" },
            { "project_id": "proj
55. exa_pageviews": 60438000000 },
    "extra_article_attributes": { "world_data": { } },
    "extra_author_attributes": { "world_data": { }, "id": "fb:100007373088511", "name": "S'bu Dlokweni", "gender": "UNKNOWN", "image_url": "https://graph.facebook.com/100007373088511/picture", "url": "http://www.facebook.com/profile.php?id=100007373088511" },
    "extra_source_attributes": { "world_data": { "continent": "Africa", "country": "South Africa", "region": "Orange Free State", "city": "Bloemfontein", "longitude": 26.2299128812, "latitude": -29.1199938774, "country_code": "za" } },
    "engagement": 1, "reach": 0

It consists of CT_DATA (the data entries) and CT_CONTROL (the control entries). One example CT_CONTROL chunk is shown below:

    { "chunk_type": "CT_CONTROL", "chunk_control": { "timeframe_start": 1409906135111, "timeframe_end": 1409906205401 } }

In this case, all results from 1409906135111 to 1409906205401 will be streamed to the application. In case of disconnection (e.g. a connection issue, or the application got restarted) you can provide the latest timeframe_start as the value of the parameter stream_resume:

    curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&stream_resume=1409906135111'

The command below returns the list of topics, which can then be used to stream only a certain topic and not all topics:

    curl https://api.talkwalker.com/api/v2
56. from the domain spiegel.de. Pay attention not to insert www. into the query.
site:twitter.com/bmw/ returns all documents from the site twitter.com/bmw. site:googleblog.blogspot.com returns documents from googleblog.blogspot.com. Pay attention to end with a / if the site includes a specific path (/bmw/), but not if it ends with the top-level domain (.com).
inurls:facebook returns all documents which have the keyword facebook anywhere in their url or which have it in any referenced url in the content.

Examples (URL-based and tag searches): is:important, is:read, is:checked, score:4, posttype:LINK, url:"http://twitter.com/bmw/status/561925861155561473", parenturl:"http://twitter.com/bmw/status/561925861155561473", hosturl:"http://www.spiegel.de", domainurl:"http://spiegel.de", site:googleblog.blogspot.com, site:blogspot.com, site:twitter.com/bmw/, inurls:facebook

Metric Minimum/Maximum Restrictions

metric_name:>n, metric_name:<n and metric_name:n return only documents which match a specific value or range of a metric. The following table explains the possible metrics.

    metric name      | Description                                                                                         | Example
    reach            | The reach of an article/post represents the number of people who were reached by this article/post | reach:>100
    engagement       | The engagement of an article/post is the sum of actions made by others on that article/post        | engagement:<1000
    facebook_shares  | Number of Facebook

Further metric names listed: facebook_likes, twitter_retweets, twitter_shares.
57. he Talkwalker API can also be used with OAuth 2.0 authentication (see the chapter on access_tokens and OAuth 2.0).

Rate Limit

This endpoint is limited to 20 calls per minute; the result should be stored.

Get search results and histograms for topics

The Project Search Result API (https://api.talkwalker.com/api/v1/search/p/<project_id>/results) and the Project Search Histogram API (https://api.talkwalker.com/api/v1/search/p/<project_id>/histogram) can be used with the same parameters as the normal Search Result API and the Search Histogram API. Additionally, to search a specific topic of a Talkwalker project, set the parameter topic to one or more topic IDs.

Modifying documents with the Talkwalker API

Single Documents

To change result documents, use the https://api.talkwalker.com/api/v2/search/p/<project_id>/<operation> endpoint. New documents are created with the create operation and existing documents are updated with the update operation. Documents are deleted and restored with the delete and undelete operations, respectively. The fields url, published and content are required. When left empty, some fields (for example sourcetype, posttype and language) are filled automatically with default values or automatically extracted values.

Examples

Create:

    curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/create?access_token=<access_token>' -d '{ "url": "http://www.example.com/doc
58. he title of an article. title:"sixt" will find all results which contain the keyword sixt within the title. title:"obama merkel"~5 matches "Obama, Seeking Ally, Finds Merkel a Tough Sell". Examples: title:"sixt", title:"obama merkel"~5

    Content Search             | It searches within the article content. content:"sixt" will find all results which mention the keyword within the main text of the article. | content:"sixt"
    Author Search              | It searches for authors of articles. author:"Franz" will find all articles which defined Franz as author.                                   | author:"Franz"
    Language Search            | It searches for languages of articles. lang:de returns only German results.                                                                 | lang:de
    Source Country Restriction | It searches for the country of origin of sources. sourcecountry:de filters all articles from German sources and which were published in Germany. | sourcecountry:de
    Author Country Restriction | Restricts on the country the author is in when writing posts. authorcountry:fr limits results to ones from French authors.                  | authorcountry:de
    Source Type Restriction    | sourcetype:BLOG restricts results to a specific media source type. Returns only BLOG entries.                                               | sourcetype:BLOG
    Comments Search            | Find only comments by setting is:comment, or results without comments with -is:comment.                                                     | is:comment
    Retweets Search            | Find only retweets with is:retweet, or exclude retweets with -is:retweet and get only original posts.                                       | is:retweet
    Twitter Reply Search       | Find only tweets that are replies to other tweets.                                                                                          |
    Questions Search           | Search f
59. in their unaccented and case-insensitive form on the content and the title of documents. To change this behaviour, use flag:<modifier name> to enable special query modes.

    Modifier Name   | Description                                                                                                                                                        | Example
    matchinurls     | Query will also match URLs and links.                                                                                                                              | flag:matchinurls
    matchauthor     | Query will also match the author field.                                                                                                                            | flag:matchauthor
    matchexact      | Use raw data search as default. All keywords are considered as case insensitive exact character strings, including special characters and punctuation.         | flag:matchexact
    matchexactcase  | Use exact raw data search as default. All keywords are considered as case insensitive exact character strings, including special characters and punctuation.   | flag:matchexactcase
    matchfuzzywords | Use fuzzy search as default. All keywords will also match combined words: carsharing will match words like carsharing, car-sharing or car sharing.             | flag:matchfuzzywords

The special modifiers can be combined: carsharing flag:matchauthor flag:matchfuzzywords searches for words like carsharing, car-sharing or car sharing in the fields title, content and author_name.

Note: When matchinurls or matchauthor is set, API results will not have highlighting in snippets when one of these fields is matched.

Talkwalker Documents Fields

field name: url, matched_query, matched_profile, indexed, search_indexed, published, title, content
name: URL, Matched Query, Matched Profi
60. ing: gzip must be set in the header. The encoding used is UTF-8.

Evolution of JSON fields

The structure of the JSON responses will not be changed. Existing fields will not be removed and their formatting will not be changed. However, new fields will be added to the responses and the order of fields can change; please take this into account when implementing a custom client.

Value options

The following tables contain possible options and formats for certain fields.

Source Type Options

    Media Source Type        | Description
    ONLINENEWS               | All news sites
    ONLINENEWS_MAGAZINE      | Printed magazines sites
    ONLINENEWS_NEWSPAPER     | Printed newspaper sites
    ONLINENEWS_PRESSRELEASES | Results from sites that publish press releases
    ONLINENEWS_TVRADIO       | TV or radio stations
    ONLINENEWS_AGENCY        | News agencies
    ONLINENEWS_OTHER         | News results that do not fall under any of the other news categories
    BLOG                     | All blog sites
    MESSAGEBOARD             | All forums and message boards
    SOCIALMEDIA              | All social media sites
    SOCIALMEDIA_TWITTER      | Results from Twitter
    SOCIALMEDIA_FACEBOOK     | Results from Facebook
    SOCIALMEDIA_YOUTUBE      | Results from YouTube
    SOCIALMEDIA_LINKEDIN     | Results from LinkedIn
    SOCIALMEDIA_GOOGLEPLUS   | Results from Google+
    SOCIALMEDIA_FLICKR       | Results from Flickr
    SOCIALMEDIA_FOURSQUARE   | Results from Foursquare
    SOCIALMEDIA_INSTAGRAM    | Results from Instagram

Media Source Types (continued): SOCIALMEDIA_MIXCLOUD, SOCIALMEDIA_SOUNDCLOUD, SOCIALMEDIA_VIMEO, SOCIALMEDIA_DAILYMOTION, OTHER

Language Options

ABKHAZIAN, AFAR, AFRIKA
61. interest Followers ATTRIBUTES Article Continent Article Country Article Region Article City Article Longitude Article Latitude Article ID Article Type Article Name Article birth date Article Gender Article Image URL Article Short Name Article URL API y y y y y y Write access through API n Number of YouTube dislikes a video has Number of Instagram likes an image has Number of Twitter share an article has Comment Number of Facebook followers a source has Number of Twitter followers a source has Number of Instagram follows a source has Number of Pinterest follows a source has Comment Continental location of the article Country location of the article Regional location of the article City location of the article Longitudinal location of the article Latitudinal location of the article Resolution of the geo data extraction URL of the article For documents which don t include location data these fields are approximated author attributes worlddata continent worlddata country worlddata region worlddata city worlddata longitude worlddata latitude country code resolution id type name birthdate gender image url short name url ATTRIBUTES Author Continent Author Country Author Region Author City Author Longitude Author Latitude Author ID Author Type Author Name Author Birthdate Author Gender Author Image URL Author Short Name Author URL Write
62. istogram API (https://api.talkwalker.com/api/v1/search/histogram/<type>)

How it works

With the Talkwalker Search Histogram API you can retrieve the distribution of the number of search results for a given search query. Histograms can be made for the distribution over time or over specific metrics (number of comments, number of shares, reach, retweets, etc.). By setting min and max, a histogram can be limited to a specific range; min_include and max_include control whether those bounds are included. interval defines the width of the bins: the accepted values are long integers for metrics, or duration values like 7d (for 7 days) for published and search_indexed dates. When using a bin size of entire days, timezone sets the timezone that determines where the days begin and end.

Types

    type             | Description
    published        | Timestamp of publication (epoch time in milliseconds)
    search_indexed   | Timestamp of indexation in Talkwalker (epoch time in milliseconds)
    reach            | The reach of an article/post represents the number of people who were reached by this article/post
    engagement       | The engagement of an article/post is the sum of actions made by others on that article/post
    facebook_shares  | Number of Facebook shares an article has
    facebook_likes   | Number of Facebook likes an article has
    twitter_retweets | Number of Twitter retweets an article has
    twitter_shares   | Number of

Further types listed: twitter_followers, youtube_views, youtube_likes, youtube_dislikes, comment_count.
63. iteBytes(on.toString());
        wr.flush();
        wr.close();
        httpConnection.connect();
        int httpCode = httpConnection.getResponseCode();
        if (httpCode != 200) {
            System.out.println("ERROR");
            System.out.println(IOUtils.toString(httpConnection.getInputStream(), "UTF-8"));
        } else {
            System.out.println("CREATED");
        }
    }

    protected void deleteStream() throws IOException {
        String requestUrl = url + "/v1/stream/s/" + stream_id + "/delete?access_token=" + token;
        // connect
        URL request = new URL(requestUrl);
        URLConnection connection = request.openConnection();
        connection.setConnectTimeout(30000);
        connection.setReadTimeout(90000);
        HttpURLConnection httpConnection = (HttpURLConnection) connection;
        httpConnection.setRequestMethod("DELETE");
        httpConnection.setRequestProperty("User-Agent", "JavaExampleClient/1.0.0");
        httpConnection.setRequestProperty("charset", "utf-8");
        httpConnection.setDoOutput(true);
        httpConnection.setDoInput(true);
        connection.setUseCaches(false);
        connection.setRequestProperty("Content-Language", "en-US");
        httpConnection.connect();
        int httpCode = httpConnection.getResponseCode();
        if (httpCode != 200) {
            System.out.println("ERROR");
            try {
                System.out.println(IOUtils.toString(httpConnection.getInputStream(), "UTF-8"));
            } catch (Exception e) {
                e.printStackTrace();
            }
        } else {
            System.out.println("DELETED");
        }
    }
}

Troubleshooting

Error Codes

http code: 200, 500, 500, 400, 400, 400, 401, 401, 401, 403, 403, 403, 403
64. kwalker.com/api/v1/search/info?access_token=<access_token>'

Parameters

    parameter    | description                                          | required? | default value
    access_token | a read/write token specified in the API application | required  |

Rate Limit

This endpoint is limited to 10 calls per minute; the result should be stored.

Get a list of all resources

Resources are data retrieval settings from a Talkwalker project. These can be search topics, filters, monitored pages, source panels, events or saved objects for embedding in external tools. To get a list of the resources defined in a Talkwalker project, use the project_id and the access_token on the https://api.talkwalker.com/api/v2/talkwalker/p/<project_id>/resources endpoint.

    curl 'https://api.talkwalker.com/api/v2/talkwalker/p/<project_id>/resources?access_token=<access_token>'

Parameters

    parameter    | description                                          | required? | values
    access_token | a read/write token specified in the API application | required  |
    type         | filter on the type of resources                      | optional  | search, filter, page, event, panel, savedobject
    object_type  | filter on types of saved objects                     | optional  | name of the saved object type (name of the embedding destination)

Example: Get all saved objects from a project that were saved for embedding in an external tool called myapp.

    curl 'https://api.talkwalker.com/api/v2/talkwalker/p/<project_id>/resources?access_token=<access_token>&type=shared_object&object_type=myapp'

Instead of using an access_token t
65. le Indexed Search Indexed Published Title Content Write access through API vi y Comment Normalized URL of the article Query which matched On streaming this information is present in extra entrydata Profile Rule which matched On streaming this information is present in extra entrydata When article was added to Talkwalker System When article was indexed by Talkwalker search system after postprocessing When article was published Text version of the source title Text version of the content Possible field values Unique Url for example http blog talkwalker com en how to export data from talkwalker Java Timestamp for example 1392821902000 Java Timestamp for example 1392821902000 Java Timestamp for example 1392821902000 field name title snippet content snippet root url domain url host url parent url lang porn level fluency level spam level sentiment source type name Title Snippet Content Snippet Root URL Domain URL Host URL Parent URL Language of the Article Pornography Level Fluency Level Spam Level Sentiment Source Type Write access through API n y y Comment If a match occurred in the title this field will contain the snippet related to the query set in the datafeed On streaming this information is present in extra entrydata If a match occurred in the article this fiel
66. mmand:

    curl -XGET 'https://api.talkwalker.com/api/v2/talkwalker/p/<projectid>/monitoring/suggest?input=<url string>&type=auto&access_token=<access_token>'

Response:

    { "status_code": "0", "status_message": "OK",
      "result_monitoring_pages": { "data": [ {
          "title": "ABC", "type": "facebook_page", "access_url": "http://facebook.com/296043200790",
          "query": "channel:V-vtwqablxreaaaacgbieemqkdivbe6t2l1cicgmzlfmqnci2duorydulzpo53xonf4zdsnrqgqztembqg44taV" } ] } }

Fetch query

Input: the access_url and the site monitoring type. Output: the query to be used in a stream.

Command:

    curl -XGET 'https://api.talkwalker.com/api/v2/talkwalker/p/<project_id>/monitoring/fetch?type=twitter_user&access_url=http%3A%2F%2Ftwitter.com%2Flufthansa&access_token=<access_token>'

Response:

    { "status_code": "0", "status_message": "OK",
      "request": "GET /api/v2/talkwalker/p/<project_id>/monitoring/fetch?type=twitter_user&access_url=http%3A%2F%2Ftwitter.com%2Flufthansa&access_token=<access_token>",
      "result_monitoring_pages": { "data": [ {
          "title": "Lufthansa", "type": "twitter_user", "access_url": "http://twitter.com/lufthansa",
          "query": "channel:V-vtwqablxreaaaacgbieemqkdivbe6t21cicgmzlfmqnci2duorydulzpo53xonf4zdsnrqgqztembqg44taV" } ] } }

Talkwalker Query Syntax

A single search query can support up to 50 operands and be up to 1024 characters long. To create co
67. mpleted timeframe; if a stream gets disconnected, a non-completed timeframe will not be billed. When resuming a disconnected stream, a partially streamed timeframe has to be restarted and streamed again. When the parameter max_hits is set, only the specified maximum number of results will be billed, even if the entire timeframe gets streamed after reaching the limit.

Order and Timing of Chunks

It is not possible to do any custom sorting with the Talkwalker Streaming API. The data is grouped in unsorted timeframes, which are returned in the order the data was added to Talkwalker. This can differ from the order in which the data was published. The number of result chunks in a timeframe is not limited. When implementing a client application, store or process the results in a reasonable batch size to limit memory usage and prevent out-of-memory errors, and do not wait for a completed timeframe.

Stream Results

To start streaming the results from a stream, at least one rule needs to be defined. The results are available at https://api.talkwalker.com/api/v2/stream/s/<stream_id>/results

Example: Start a stream

    curl 'https://api.talkwalker.com/api/v2/stream/s/teststream/results?access_token=demo'

Example: Resume a disconnected stream

Set the parameter stream_resume to the start timestamp (timeframe_start) of the last CT_CONTROL chunk. Since the results in a timeframe are not sorted, the streaming of the entire timeframe has to be restarte
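The resume rule described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not part of the Talkwalker example client: the class name StreamResumeUrl and the method resultsUrl are hypothetical, and the URL layout is taken from the curl examples in this document. The caller is assumed to remember the timeframe_start of the last CT_CONTROL chunk it received.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the streaming results URL. */
final class StreamResumeUrl {
    // Base path as it appears in the curl examples of this document.
    static final String BASE = "https://api.talkwalker.com/api/v2/stream/s/";

    /**
     * Returns the results URL for a stream. lastTimeframeStart is the
     * timeframe_start of the last CT_CONTROL chunk, or null for a fresh start.
     */
    static String resultsUrl(String streamId, String accessToken, Long lastTimeframeStart) {
        StringBuilder sb = new StringBuilder(BASE)
                .append(encode(streamId))
                .append("/results?access_token=")
                .append(encode(accessToken));
        if (lastTimeframeStart != null) {
            // Resuming restarts the whole timeframe, so the client may see
            // results of that timeframe a second time.
            sb.append("&stream_resume=").append(lastTimeframeStart.longValue());
        }
        return sb.toString();
    }

    private static String encode(String value) {
        // URLEncoder.encode(String, Charset) requires Java 10 or newer.
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(resultsUrl("teststream", "demo", null));
        System.out.println(resultsUrl("teststream", "demo", 1409906135111L));
    }
}
```

Because a resumed timeframe is streamed again from its beginning, a client built this way would probably want to de-duplicate results (for example by url) within the resumed timeframe.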
68. mplex queries, operands may be combined using Boolean operators. All queries are executed in their unaccented and case-insensitive form; thus a search for éLèVE will also match all documents with the word eleve. No language stemming is done; thus a search for children won't return results with the word child.

Special Transformations

These transformations apply when a query contains no operators from the query syntax (quotes, AND, OR, wildcards etc., see below).

Words with only capital letters and special chars (&) are executed as exact (case-sensitive) raw data search. Examples: ABC, A&B.

Screen names (@name), hashtags (#hashtag) and cashtags ($cashtag), as well as words containing a dash (-), a plus (+) or an ampersand (&), are executed as case-insensitive raw data search. Examples: @username, p+t, p&t.

If a query contains multiple simple words (no special characters, no operators) and is not only capital letters, it is executed as a proximity search. The maximum number of jumps is set to (#words - 1) * 10. Example: cat dog mouse bird is executed as "cat dog mouse bird"~30.

To prevent this behaviour, use the explicit query syntax below: instead of cat dog mouse, use cat OR dog OR mouse, cat AND dog AND mouse, or "cat dog mouse" to search for one of the words, all the words, or the exact phrase.

Operators: Boolean Operators (AND, NOT, OR), Exclusion of Keywords, Phrase Search, Combinations, Wildcard Sear
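The implicit proximity rule for plain multi-word queries can be worked through in a few lines of Java. This is only a sketch: the class ImplicitProximity and its method are hypothetical, the jump limit (number of words - 1) * 10 is inferred from the four-word example that yields 30, and the quoted-phrase-plus-tilde notation is taken from the proximity examples elsewhere in this document. The sketch does not check whether the input contains operators or consists only of capital letters, in which case the rule would not apply.

```java
/** Hypothetical illustration of the implicit proximity transformation. */
final class ImplicitProximity {

    /** Number of jumps assumed for a query of the given word count. */
    static int maxJumps(int wordCount) {
        return (wordCount - 1) * 10;
    }

    /** Rewrites a plain multi-word query into an explicit proximity query. */
    static String toExplicit(String query) {
        String trimmed = query.trim();
        String[] words = trimmed.split("\\s+");
        if (words.length < 2) {
            // A single word is not turned into a proximity search.
            return trimmed;
        }
        return "\"" + String.join(" ", words) + "\"~" + maxJumps(words.length);
    }

    public static void main(String[] args) {
        System.out.println(toExplicit("cat dog mouse bird"));
    }
}
```

Seeing the explicit form makes it easier to decide whether the default is wanted, or whether the query should be rewritten with OR, AND or a quoted phrase.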
69. n the last 30 days. In addition, a histogram of the number of results can also be returned. You can sort the results by publication time, indexing time, engagement or other metrics. A single search query can support up to 50 operands. To create complex queries, operands may be combined using Boolean operators.

A few words about the results

Search results can be sorted by engagement, time or other metrics, and can be restricted to specific attribute value ranges (for example, only return results published in a certain time range). When no special filters are applied, a single search request returns results from all media types and all languages over the past 30 days, sorted by engagement by default. You don't need to execute a separate search request for each language and media type. To get a smaller set of results, you can either get only the highest-ranked results or get a random sample set.

A brief example: Search

The Talkwalker API search results endpoint (https://api.talkwalker.com/api/v1/search/results) is used to search on the Talkwalker API. For testing purposes, the access token demo can be used. Setting the variable pretty=true returns formatted results.

command:

    curl 'https://api.talkwalker.com/api/v1/search/results?access_token=demo&q=cats&pretty=true'

response (all responses are UTF-8):

    { "status_code": "0", "status_message": "OK",
      "request": "GET /api/v1/search/results?access_token=demo&q=cats&pretty=true"
70. nfig Time

/**
 * The Example class can be used as an example. It is invoked via the ExampleTest class in this test case.
 */
public class TalkwalkerApiStreamingClientExample {
    private final String url;
    private final String token;
    private final String stream_id;
    private final long start_ts;
    private final long stop_ts;

    public TalkwalkerApiStreamingClientExample(String url, String token, String stream_id, long start_ts, long stop_ts) {
        this.url = url;
        this.token = token;
        this.stream_id = stream_id;
        this.start_ts = start_ts;
        this.stop_ts = stop_ts;
    }

    public void run() throws InterruptedException, IOException {
        deleteStream();
        System.out.println("CREATING STREAM");
        createStream();
        AtomicLong resume_ts = new AtomicLong(start_ts);
        boolean finished = false;
        while (!finished) {
            try {
                String requestUrl = url + "/v2/stream/s/" + stream_id + "/results?access_token=" + token
                        + "&stream_resume=" + resume_ts.get() + "&stream_stop=" + stop_ts;
                // connect
                URL request = new URL(requestUrl);
                URLConnection connection = request.openConnection();
                connection.setConnectTimeout(30000);
                connection.setReadTimeout(90000);
                HttpURLConnection httpConnection = (HttpURLConnection) connection;
                httpConnection.setRequestMethod("GET");
                httpConnection.setRequestProperty("User-Agent", "JavaExampleClient/1.0.0");
                httpConnection.setRequestProperty("Accept-Encoding", "gzip");
                connection.setUseCaches(false);
                connection.setRequestProperty("Content-Language", "en-US");
                httpConnection.connect
71. ntaining the word birds: set the query to birds sourcetype:ONLINENEWS. By default the Talkwalker Search Histogram API returns results over the last seven days.

    curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=birds%20sourcetype%3AONLINENEWS&interval=day&pretty=true'

response:

    { "status_code": "0", "status_message": "OK",
      "request": "GET /api/v1/search/histogram?access_token=demo&q=birds%20sourcetype:ONLINENEWS&interval=d&pretty=true",
      "result_histogram": {
        "header": { "v": [ "Number Results" ] },
        "data": [
          { "t": 1417478400000, "v": [ 4366.0 ] },
          { "t": 1417564800000, "v": [ 3385.0 ] },
          { "t": 1417651200000, "v": [ … ] },
          { "t": 1417737600000, "v": [ 4071.0 ] },
          { "t": 1417824000000, "v": [ … ] },
          { "t": 1417910400000, "v": [ 2191.0 ] },
          { "t": 1417996800000, "v": [ … ] },
          { "t": 1418083200000, "v": [ 4020.0 ] } ] } }

Get a histogram with a resolution of 6 hours over the last 7 days of results containing the word birds: set interval to 6h for 4 values per day.

    curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=birds&interval=6h&pretty=true'

The interval parameter accepts the values year, quarter, month, week, day, hour, minute and second, as well as numeric values with the units w (week), d (day), h (hours), m (minutes) and s (seconds).

Get a histogram over a specific range: Set min to
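To make the relation between interval and the t values in the response concrete, the following Java sketch maps a timestamp to the start of its bin. It rests on an assumption that this document does not state: that bins are aligned to whole multiples of the interval counted from the Unix epoch in UTC, which is consistent with the daily t values shown in the example response (each is midnight UTC). The class HistogramBins and its methods are hypothetical names.

```java
/** Hypothetical illustration of how a timestamp falls into a histogram bin. */
final class HistogramBins {
    static final long HOUR_MS = 3_600_000L;
    static final long DAY_MS = 24 * HOUR_MS;

    /** Start of the bin containing timestampMs, assuming epoch-aligned UTC bins. */
    static long binStart(long timestampMs, long intervalMs) {
        return Math.floorDiv(timestampMs, intervalMs) * intervalMs;
    }

    /** Number of bins per day for an interval that evenly divides a day. */
    static long binsPerDay(long intervalMs) {
        return DAY_MS / intervalMs;
    }

    public static void main(String[] args) {
        long published = 1417503600000L; // 7 hours after 1417478400000
        System.out.println(binStart(published, DAY_MS));      // daily bin
        System.out.println(binStart(published, 6 * HOUR_MS)); // 6h bin
        System.out.println(binsPerDay(6 * HOUR_MS));
    }
}
```

With a timezone other than UTC set on the request, day-sized bins would start at local midnight instead, so this arithmetic would need an offset.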
72. on.getContentEncoding() != null && httpConnection.getContentEncoding().equals("gzip")) {
            is = new GZIPInputStream(is);
        }
        // read json using jackson json, another library may be used here
        JsonFactory factory = new JsonFactory();
        ObjectMapper mapper = new ObjectMapper(factory);
        TypeReference<HashMap<String, Object>> typeRef = new TypeReference<HashMap<String, Object>>() {};
        HashMap<String, Object> o = mapper.readValue(is, typeRef);
    }

    private void readStream(HttpURLConnection httpConnection, InputStream inputStream, AtomicLong resumeTs) throws IOException {
        // reading the stream and invoking the listener
        InputStream is = inputStream;
        if (httpConnection.getContentEncoding() != null && httpConnection.getContentEncoding().equals("gzip")) {
            is = new GZIPInputStream(is);
        }
        BufferedReader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"), 100);
        String line;
        while ((line = reader.readLine()) != null) {
            // parse json, use an available json parser
            // skip empty lines
            if (line.isEmpty()) {
                continue;
            }
            JsonFactory factory = new JsonFactory();
            ObjectMapper mapper = new ObjectMapper(factory);
            TypeReference<HashMap<String, Object>> typeRef = new TypeReference<HashMap<String, Object>>() {};
            HashMap<String, Object> o = mapper.readValue(line, typeRef);
            Object oType = o.get("chunk_type");
            if (oType != null && oType instanceof String) {
                String type = (String) oT
73. or questions. is:question will find only documents that are questions.

    Image Search | contains:image returns those documents that include images | contains:image
    Audio Search | contains:audio returns those documents that include audio  | contains:audio
    Video Search | contains:video returns those documents that include videos | contains:video

Talkwalker Tags Search, Score Search, Post Type Search

    Talkwalker Tags Search | is:important finds all documents that were manually tagged as important in Talkwalker. is:read finds documents that were read (original document link opened). is:checked finds documents where the sentiment has been checked manually in the project.
    Score Search           | score:n finds all documents that were manually tagged with the respective score. In a Talkwalker project, scores can be added to a selected document by pressing the number keys.
    Post Type Search       | posttype:IMAGE allows to search only for documents of type image. Possible values are TEXT, LINK, IMAGE, VIDEO, AUDIO.

Url-based Search (Url Search, Parent Url Search, Host Url Restriction, Domain Url Restriction, Site Search, In Urls Search)

    Url Search             | url: returns the document with this exact url. Prefix wildcard matching (e.g. *apple) is not supported.
    Parent Url Search      | parenturl: returns all child documents (comments or retweets) of the document specified by the given url, e.g. "Give me all the comments for this document url".
    Host Url Restriction   | hosturl:www.spiegel.de returns all the documents from the host www.spiegel.de.
    Domain Url Restriction | domainurl:spiegel.de returns all the documents
74. p topic page

How to eliminate retweets or comments from a stream

To remove retweets and retrieve only the original tweets, add -is:retweet or -is:comment to the rules of a stream. If you want to remove all retweets from an entire stream, you can also add the query -is:retweet when getting the results of a stream.

    curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>&q=-is:retweet'

How to get only documents of a Talkwalker project that include special keywords

To get a stream of only a subset of the documents of a Talkwalker project, you can set up rules for your stream. Rules are expressed in the Talkwalker query syntax. https://api.talkwalker.com/api/v2/stream/s/<stream_id>/r/<rule_id> is used to set new rules for an existing stream. If you define more than one rule, the stream returns any document that matches at least one rule.

    curl -XPUT 'https://api.talkwalker.com/api/v2/stream/s/teststream/r/rule-1?access_token=demo' -d '{ "query": "keyword1 AND keyword2" }' -H "Content-Type: application/json; charset=UTF-8"

The stream will now only return documents that match keyword1 AND keyword2; the field data.highlighted_data.matched.rule_id indicates which rules were matched.

How to use a single stream for multiple applications/clients

To use one stream to retrieve data for more than one application/client, rules are used. Set a separate rule using the Talkw
75. project with the Talkwalker API. Topics can be used with the Search Results API and the Search Histogram API. This allows Talkwalker users to use the queries from their projects and to retrieve the documents they get in their Talkwalker project, including changes and tags that were made in Talkwalker. In addition to the 30 days of search, the full history of Talkwalker projects is available in the search API when used in combination with a Talkwalker project.

Parameters
parameter      description                                        required   default value
access_token   API access token                                   required
q              The query to search for                            required
offset         Number of results to skip (for paging)             optional   default: 0
hpp            Number of hits per page (for paging)               optional   default: 10, maximum: 500
sort_by        Criteria for sorting the results                   optional   default: engagement
sort_order     Sorting order (ascending or descending)            optional   default: desc
hl             Turns highlighting on or off                       optional   default: true
topic          One or more topics or panels that are defined      optional   multiple
               in the Talkwalker project

Credits
1 credit per returned result (minimum 10 credits per Search Result API call).
10 credits per Search Histogram API call.
No credits for project list, topic list, document update and document delete calls.

Get a list of all projects linked to an API application
Use the private access token from your API application on the https://api.talkwalker.com/api/v1/search/info endpoint to get the list of all linked projects.

    curl https://api.tal
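The credit rules above amount to simple arithmetic. A small Python sketch follows; the function names are hypothetical, and the reading that the 10-credit minimum applies per Search Result call is taken directly from the wording of the credits section.

```python
def search_result_credits(returned_results):
    """1 credit per returned result, with a minimum of 10 credits per call."""
    if returned_results < 0:
        raise ValueError("returned_results must not be negative")
    return max(10, returned_results)


def histogram_credits(calls):
    """10 credits per Search Histogram API call."""
    return 10 * calls


# A page of 3 results still costs the 10-credit minimum; 250 results cost 250.
print(search_result_credits(3), search_result_credits(250), histogram_credits(2))
```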
76. proximity search operator but also works with parentheses and thus can be used with multiple terms (default value for x: 15).
Same as NEAR x but respects the order of terms.

Examples: BMW AND bike; BMW AND NOT bike; BMW AND NOT bike; BMW OR bike; NOT coupons; bmw series; BMW AND motorcycle OR car; Luxemb; reali ation; obama merkel 5; roam 1; carsharing; L or al; L Or al; BMW OR Audi NEAR 3 motorcycle OR car; BMW OR Audi ONEAR 3 motorcycle OR car

Sentence Search
The SENTENCE operator works similarly to the NEAR x operator. It searches for keywords that appear in the same sentence. SENTENCE can also be used with multiple terms.
Example: BMW OR Audi SENTENCE motorcycle OR car

Ordered Sentence Search
Same as SENTENCE but respects the order of terms in the sentence.
Example: BMW OR Audi OSENTENCE motorcycle OR car

Note: In phrase search and raw data phrase search, the number and type of white space characters are ignored. For example, BMW series (one space) will also match documents which contain BMW  series (two spaces) and vice versa. White space characters include spaces, tabs and new line characters; transitions between letters and special characters are also considered as whitespace. For example, p&T will match p&t but also p t and P&T.

Advanced Search Options
Single Keyword Search: for simple brands, products, keywords etc. Example: Apple
Title Search: It searches within t
77. results which have been found since the time you executed the command below.

    curl 'https://api.talkwalker.com/api/v2/stream/s/test/p/<project_id>/results?access_token=<access_token>'

This will stream the data to your application. For each entry, or every second if there are no entries, our server will send you a newline. Below is an example of the data you will receive:

    chunk_type: CT_CONTROL
    chunk_control:
      timeframe_start: 1409906205401
      timeframe_end: 1409906265618

    chunk_type: CT_RESULT
    chunk_result:
      data:
        data:
          url: http://www.facebook.com/permalink.php?id=45012929134&story_fbid=10152329058194135
          matched_profile: hznwvi3k_5imn wzqr36f Jk
          indexed: 1409906120127
          search_indexed: 1409906245484
          published: 1409902879000
          content: Cn u hlp me abt my dstv account
          title_snippet:
          content_snippet: Cn u hlp me abt my <b>dstv</b> account
          root_url: http://www.facebook.com/45012929134
          domain_url: http://facebook.com
          host_url: http://www.facebook.com
          parent_url: http://www.facebook.com/permalink.php?id=45012929134&story_fbid=10152329058194135
          lang: en
          porn_level: 0
          fluency_level: 100
          spam_level: 0
          sentiment: 0
          source_type: SOCIALMEDIA, SOCIALMEDIA_FACEBOOK
          post_type:
          article_extended_attributes:
            num_comments: 1
          source_extended_attributes:
            al
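The stream described above can be consumed line by line. The sketch below, in Python, assumes that each non-empty line carries exactly one JSON chunk and that the field names are chunk_type, chunk_result and data.data.url (underscores reconstructed from the flattened example); it works on an in-memory list of lines rather than a live connection.

```python
import json


def iter_chunks(lines):
    """Yield one parsed chunk per non-empty line; blank keep-alive lines are skipped."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)


def result_urls(lines):
    """Collect the url of every CT_RESULT chunk, ignoring control chunks."""
    urls = []
    for chunk in iter_chunks(lines):
        if chunk.get("chunk_type") != "CT_RESULT":
            continue
        doc = chunk.get("chunk_result", {}).get("data", {}).get("data", {})
        if "url" in doc:
            urls.append(doc["url"])
    return urls


# Hypothetical sample input: one control chunk, one keep-alive newline, one result.
sample = [
    '{"chunk_type": "CT_CONTROL", "chunk_control": {"timeframe_start": 1409906205401, "timeframe_end": 1409906265618}}',
    "",
    '{"chunk_type": "CT_RESULT", "chunk_result": {"data": {"data": {"url": "http://www.example.com/docs/doc1.html"}}}}',
]
print(result_urls(sample))
```

With a real connection, the same functions could be fed the lines of the HTTP response body as they arrive.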
78. s

To resume a disconnected stream, set the parameter stream.resume to the start timestamp (timeframe_start) of the last CT_CONTROL chunk. Since the results in a timeframe are not sorted, the streaming of the entire timeframe has to be restarted to make sure that no documents are lost.

    curl 'https://api.talkwalker.com/api/v2/stream/s/teststream/results?access_token=demo&stream.resume=1388534400000'

The Streaming API returns different results for the same topic than the Talkwalker application
Possible reasons:
Different queries or source filters. Use https://api.talkwalker.com/api/v2/stream/s/<stream_id>?access_token=demo&pretty=true to make sure that no additional rules and source blacklists are set.
Documents are streamed at indexation time. Talkwalker finds most documents shortly after they were created; at this moment they are added to Talkwalker and streamed via the API. Documents that are found later (i.e. some time after they were published on the original webpage) will be added to Talkwalker with their original publication timestamp (the published field), alongside the documents that were found earlier. In the Streaming API they only appear at the moment they were found (timestamp in the search_indexed field).
Solutions:
a query on published (published:>1388534400000 AND published:<1388544400000)
a stream with a start point (stream.resume) at the beginning of the time range and a stop point (stream.stop
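The resume procedure above can be sketched as a small state holder that remembers the timeframe start of the last control chunk and builds the reconnect URL from it. This is an illustrative Python sketch: the class name ResumeTracker is hypothetical, and the parameter spelling stream.resume and the chunk field names are assumptions reconstructed from the flattened text.

```python
from urllib.parse import urlencode


class ResumeTracker:
    """Remember the timeframe start of the most recent control chunk."""

    def __init__(self):
        self.resume_ts = None

    def observe(self, chunk):
        # Only control chunks carry the timeframe; result chunks are ignored here.
        if chunk.get("chunk_type") == "CT_CONTROL":
            start = chunk.get("chunk_control", {}).get("timeframe_start")
            if start is not None:
                self.resume_ts = start

    def results_url(self, stream_id, access_token):
        params = {"access_token": access_token}
        if self.resume_ts is not None:
            # Restart at the beginning of the last timeframe so no document is lost.
            params["stream.resume"] = self.resume_ts
        return f"https://api.talkwalker.com/api/v2/stream/s/{stream_id}/results?{urlencode(params)}"


tracker = ResumeTracker()
tracker.observe({"chunk_type": "CT_CONTROL", "chunk_control": {"timeframe_start": 1388534400000}})
print(tracker.results_url("teststream", "demo"))
```

Because the whole timeframe is replayed after a reconnect, a client built this way would see some documents twice and may want to deduplicate them by url.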
79. s are aligned with the different reset times.

Additional Information on Quota in Control Chunks
The information delivered through the control chunk contains the list of streams requested by the connection. It contains:
the number of results delivered per stream
the remaining quota (if applicable)
the status of the stream (if it has been deactivated because of the quota)
The number of remaining credits on the account can be requested through the credits API. Control chunks will have the following additional information:

    timeframe_start: 1427216400000
    timeframe_end: 1427216460000
    stream:
      id: stream-1
      allowance: 10000
      usage: 5000
      reset: 1427241600000
      status: active

Temporarily Disable Streams
    POST https://api.talkwalker.com/api/v2/stream/s/<stream_id>/enable
    POST https://api.talkwalker.com/api/v2/stream/s/<stream_id>/disable
These endpoints allow you to temporarily disable a stream or to enable it again. Disabling a stream has the same effect as a stream reaching its quota. Disabled streams are shown in control chunks with status disabled. Newly created streams are enabled; while creating a stream you can explicitly specify enabled: true or enabled: false.

Talkwalker Streaming API and Talkwalker Projects
    https://api.talkwalker.com/api/v2/stream/s/<stream_id>/p/<project_id>/results
How it works
Talkwalker users can use the topics defined in their project with the Talkwalker API. Topics c
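The quota fields of a control chunk can be turned into a short summary. The Python sketch below assumes that the remaining quota is allowance minus usage and that reset is an epoch timestamp in milliseconds, like the other timestamps in the examples; neither is stated explicitly in the text, and the function name quota_summary is hypothetical.

```python
from datetime import datetime, timezone


def quota_summary(stream_entry):
    """Summarise the quota fields of one stream entry from a control chunk."""
    allowance = stream_entry.get("allowance")
    usage = stream_entry.get("usage")
    remaining = None
    if allowance is not None and usage is not None:
        # Assumption: remaining quota = allowance - usage, never below zero.
        remaining = max(0, allowance - usage)
    reset = stream_entry.get("reset")
    reset_at = None
    if reset is not None:
        # Assumption: reset is epoch time in milliseconds.
        reset_at = datetime.fromtimestamp(reset / 1000, tz=timezone.utc)
    return {
        "remaining": remaining,
        "reset_at": reset_at,
        "active": stream_entry.get("status") == "active",
    }


entry = {"id": "stream-1", "allowance": 10000, "usage": 5000,
         "reset": 1427241600000, "status": "active"}
summary = quota_summary(entry)
print(summary["remaining"], summary["reset_at"].isoformat(), summary["active"])
```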
80. s/doc1.html", "title": "This is a title", "content": "Example content. Really not that much.", "tags_marking": ["read"], "published": 1430136532000 }' -H "Content-Type: application/json; charset=UTF-8"

Update
Setting a new title field, adding an important tag and removing the read tag:

    curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/update?access_token=<access_token>' -d '{ "url": "http://www.example.com/docs/doc1.html", "title": "This is a new title", "content": "Example content. Really not that much.", "+tags_marking": ["important"], "-tags_marking": ["read"], "extra_author_attributes": { "name": null }, "published": 1430136532000 }' -H "Content-Type: application/json; charset=UTF-8"

Fields that are of type array can be updated in three ways: using <fieldname> to replace the whole array, +<fieldname> to add an item to the array, and -<fieldname> to remove an item. Fields can be cleared by explicitly setting them to null.

Delete
Deleting a document:

    curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/delete?access_token=<access_token>' -d '{ "url": "http://www.example.com/docs/doc1.html" }' -H "Content-Type: application/json; charset=UTF-8"

Undelete
Undeleting a document:

    curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/undelete?access_token=<access_token>' -d '{ "url": "http://www.example.com/docs/doc1.html" }' -H "Conten
81. set($json->chunk_type)) {
                switch ($json->chunk_type) {
                    case 'CT_ERROR':
                        $this->handleStreamError($json->chunk_error);
                        break;
                    case 'CT_CONTROL':
                        if (isset($json->chunk_control->timeframe_start)) {
                            $this->resume_ts = $json->chunk_control->timeframe_start;
                        }
                        $this->handleStreamControl($json->chunk_control);
                        break;
                    case 'CT_RESULT':
                        $this->handleStreamResult($json->chunk_result);
                        break;
                    default:
                        $this->unhandledStreamChunk($json);
                        break;
                }
            } else {
                $this->unhandledStreamChunk($json);
                break;
            }
        } else {
            $this->unprocessed_data = "";
        }
    } elseif ($http_status == 503) {
        $header_array = $this->parseHeader($partial_header);
        if (array_key_exists('Retry-After', $header_array)) {
            $this->wait_for_retry = $header_array['Retry-After'];
        }
    } else {
        $this->error_data = $this->error_data . $data;
    }
    return strlen($data);
}

function onStatusError($str) {
    echo "START ERROR\n" . $str . "\n";
}

function handleParseError($str) {
    echo "Could not parse " . $str . "\n";
}

function handleStreamError($err) {
    echo "ERROR\n";
    var_dump($err);
}

function handleStreamControl($ctrl) {
    echo "CONTROL " . $ctrl->timeframe_start . " TO " . $ctrl->timeframe_end . "\n";
}

function handleStreamResult($res) {
    if (isset($res->data->data->url)) {
        echo "RESULT " . $res->data->data->url . "\n";
    }
}

function unhandledStreamChunk($json) {
    echo "UNHANDLED\n";
    var_dump($json);
}

function parseHeader($header) {
    $headers = array(
82. share an article has
Number of Facebook likes an article has
Number of Twitter retweets an article has
Number of Twitter shares an article has
Examples: facebook_shares:0, facebook_likes:>0, twitter_retweets:1000, twitter_shares:0

metric name           Description
twitter_followers     Number of Twitter followers a source has
youtube_views         Number of YouTube views a video has
youtube_likes         Number of YouTube likes a video has
youtube_dislikes      Number of YouTube dislikes a video has
instagram_likes       Number of Instagram likes a post has
instagram_followers   Number of Instagram followers a post has
comment_count         Number of comments an article has
published             Timestamp of publication (epoch time in milliseconds)
searchindexed         Timestamp of indexation in Talkwalker (epoch time in milliseconds)
sample                Get a random sample of the results (percent of the total number of results), i.e. setting 25 will return one in four documents (values: 1 to 100)
sample_million        Similar to sample (percent) but with higher precision, i.e. setting 2000 will return one in 500 documents (values: 1 to 1000000)
sentiment             The detected sentiment of the article (values: -5 (negative) to 5 (positive)); sentiment:positive, sentiment:negative and sentiment:neutral map to the respective sentiment ranges of Talkwalker

Geographic Restrictions

Example: twitter_followers:1000, youtube_views:100000, youtube_likes:>100, youtube_dislikes:>0, instagram_likes:>0, inst
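The two sampling metrics differ only in their denominator, which a short calculation makes explicit. The Python sketch below uses hypothetical function names and the value ranges given in the descriptions above.

```python
from fractions import Fraction


def sample_fraction(percent):
    """Fraction of documents returned by a percent-based sample (values 1 to 100)."""
    if not 1 <= percent <= 100:
        raise ValueError("sample must be between 1 and 100")
    return Fraction(percent, 100)


def sample_million_fraction(per_million):
    """Fraction of documents returned by a per-million sample (values 1 to 1000000)."""
    if not 1 <= per_million <= 1_000_000:
        raise ValueError("sample_million must be between 1 and 1000000")
    return Fraction(per_million, 1_000_000)


# 25 percent is one document in four; 2000 per million is one document in 500.
print(sample_fraction(25), sample_million_fraction(2000))
```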
83. single sign on URL for a Talkwalker account or application. To get such a URL, the loginurl endpoint is used; the returned login URL is only valid for 10 seconds. The alternative endpoint api.talkwalker.com/api/v2/auth/loginurl?access_token=<access_token> can be used to log in without specifying a user; the returned login URL will authenticate as the account administrator.

Parameters
parameter       description                                      required   default value
access_token    Authentication access token                      required
project_id      ID of a Talkwalker project                       required
page            Menu page that will be opened on login           optional   home_screen
view            View that will be shown on login                 optional   home_screen
logout_url      URL the user will be redirected to on logout     optional   default login page
token_timeout   Timeout for the generated login token            optional   10s
pretty          Formatted json for testing                       optional   false

token_timeout accepts values in minutes or seconds (for example 5s or 1m), with a maximum time of 30m. Either page can be set (monitor, dashboard or home_screen) to lead the user to a generic menu, or view can be set to lead to a specific stored view. To get a list of all views see below.

Example
    https://api.talkwalker.com/api/v2/auth/u/<user_id>/loginurl?access_token=<access_token>&pretty=true

Result
    status_code: 0
    status_message: OK
    request: GET /api/v2/auth/u/<user_id>/loginurl?access_token=<access_token>&pr
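The token_timeout format described above can be validated on the client before a login URL is requested. This Python sketch assumes the unit suffixes are s for seconds and m for minutes (the suffix letters are partly illegible in the text); the function name parse_token_timeout is hypothetical.

```python
import re

MAX_TIMEOUT_SECONDS = 30 * 60  # maximum of 30 minutes, as stated in the text


def parse_token_timeout(value):
    """Convert a timeout such as '5s' or '1m' into seconds, enforcing the maximum."""
    match = re.fullmatch(r"(\d+)([sm])", value)
    if match is None:
        raise ValueError(f"unsupported token_timeout: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    seconds = amount * 60 if unit == "m" else amount
    if seconds > MAX_TIMEOUT_SECONDS:
        raise ValueError("token_timeout exceeds the 30 minute maximum")
    return seconds


print(parse_token_timeout("5s"), parse_token_timeout("1m"), parse_token_timeout("30m"))
```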
84. specially make rings and pendants, but you will find some pins and earrings as well. All my pieces are one of a kind, so no two pieces are the same. I love traveling and much of my work reflects the memories of places I love. I also like to bring back from my trips beautiful and unique glass and ceramic beads and cabochons, and found pieces such as ceramic shards and beach pottery, to incorporate in my work or use as focal pieces. In recent years
    title_snippet: Color and Light Inspirations in Jewelry SUNNY RINGS
    root_url: http://annukcreations.blogspot.com
    domain_url: http://blogspot.com
    host_url: http://annukcreations.blogspot.com
    parent_url: http://annukcreations.blogspot.com/2014/12/sunny-rings.html
    lang: en
    porn_level: 0
    fluency_level: 90
    spam_level: 20
    sentiment: 5
    source_type: BLOG, BLOG_OTHER
    post_type: TEXT
    tokens_title: and Light Inspirations, Light Inspirations, Light Inspirations, SUNNY RINGS, SUNNY RINGS, and Light Inspirations, Inspirations, RINGS, RINGS, Light, Light, Jewelry, Jewelry, Color, Color, SUNNY, SUNNY
    tokens_content: Bead Hoarder Blog, Bead Hoarder Blog
    tokens_mention: yahoo
    tags_internal: isQuestion
    article_extended_attributes:
      num_comments: 3
    source_extended_attributes:
      alexa_pageviews: 0
    extra_article_attributes:
      world_data: Je
85. ss_token>&type=search

The result could look like this:

    status_code: 0
    status_message: OK
    request: GET /api/v2/talkwalker/p/<project_id>/resources?access_token=<access_token>&type=search
    result_resources:
      projects:
        id: <project_id>
        title: Air France
        topics:
          id: search 1
          title: Category 1
          nodes:
            id: search 1 1
            id: search 1 2
          id: search 2
          title: Category 2
          nodes:
            id: search 2 1
            id: search 2 2
            id: search 2 2

To get results for all projects in search, use search as topic ID. To use a single topic, use the id of the topic, for example search 2 1 for topic 1 of category 2 in search.

Code Examples
Streaming Client Examples
PHP
Note: This example needs the php cURL library and PHP 5.5.

client.php
<?php
class TalkwalkerApiStreamingClientExample {
    private $url;
    private $token;
    // internal
    private $finished = FALSE;
    private $resume_ts;
    private $unprocessed_data;
    private $header_size = 1;
    private $header;
    private $header_complete = FALSE;
    private $wait_for_retry = 0;
    private $error_data;

    public function __construct($url, $token) {
        $this->url = $url;
        $this->token = $token;
    }

    function setCurlOptions($ch) {
        curl_setopt($ch, CURLOPT_CON
86. t-Type: application/json; charset=UTF-8"

Multiple Documents
Multiple documents can be manipulated using the https://api.talkwalker.com/api/v2/search/p/<project_id> endpoint. The execution order of the given document operations is not guaranteed; multiple operations on a single document in a single request should be avoided.

    curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/delete?access_token=<access_token>' -d '{ "create": [ { "url": "http://www.example.com/docs/doc1.html", "title": "This is the title of doc 1", "content": "and this is the content of doc 1" } ], "update": [ { "url": "http://www.example.com/docs/doc2.html", "title": "This is the title of doc 2", "content": "and this is the content of doc 2" } ], "delete": [ { "url": "http://www.example.com/docs/doc3.html" } ] }' -H "Content-Type: application/json; charset=UTF-8"

Parameters
parameter      description                                              required   values
access_token   a read/write token specified in the API application     required
return_entry   Specifies if the modified document should be returned   optional   hide, default: show

See Talkwalker Documents

Talkwalker Streaming API
Source: https://api.talkwalker.com/api/v2/stream
How it works
The Talkwalker Streaming API delivers real time data through a persistent connection to our servers. Configure your stream with a set of filtering rules, connect to the stream, and new res
87. tream, its rules, queries and panels are represented by the following json object. stream_id, rule_id and panel_id are used to reference streams, rules and panels and have to be unique within a project. stream_id and rule_id are also used in the results to specify which rule or stream matched a result.

Example
    stream_id: teststream
    rules:
      rule_id: rule-1
      query: cats

      rule_id: teststream-dogs-toppanel
      query: dogs
      panel: referenced panel toppanel

Stream ids, rule ids, panel ids, etc. can only contain lowercase letters, numbers and the characters "-" and "_". They have to start with a lower case letter.

json fields
parameter   description                                  required   default value
stream_id   id we want to reference this stream with     required
rules       a set of rules for this stream               optional

A set of rules can be either an array of strings to be matched or, for more advanced usage, a rule defined as the following object:

parameter        description                                                     required   default value
rule_id          id we want to reference this rule with (will also be            optional
                 returned when the rule matched)
query            a query defining this rule                                      optional
panels           a set of panels that are being applied to this rule             optional
panel_matching   matching can be all or any (if the doc needs to be in all       optional   any
                 panels or in a single panel)

Note: either a query or a panel must be set.

The Talkwalker API returns a sequence of chunks; in version 2 (v2/stream) the
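The naming rule for stream, rule and panel ids can be checked before a stream definition is sent. The Python sketch below assumes that the two permitted special characters are the hyphen and the underscore, which are not legible in every copy of the text; the function name is_valid_id is hypothetical.

```python
import re

# Assumption: "-" and "_" are the two special characters permitted in ids.
ID_PATTERN = re.compile(r"[a-z][a-z0-9_-]*")


def is_valid_id(identifier):
    """Lowercase letters, digits, '-' and '_' only, starting with a lowercase letter."""
    return ID_PATTERN.fullmatch(identifier) is not None


def invalid_ids(stream):
    """Return every stream_id or rule_id in a stream definition that breaks the rule."""
    candidates = [stream.get("stream_id", "")]
    candidates += [rule["rule_id"] for rule in stream.get("rules", []) if "rule_id" in rule]
    return [c for c in candidates if not is_valid_id(c)]


stream = {"stream_id": "teststream",
          "rules": [{"rule_id": "rule-1", "query": "cats"},
                    {"rule_id": "Rule 2", "query": "dogs"}]}
print(invalid_ids(stream))
```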
88. ults will be delivered in real time as soon as they are found by our crawlers. You will not need to do any polling to receive new data.

You set up and configure the Streaming API by defining rules (Boolean query language, media types, etc.). The Streaming API then finds and collects all relevant data and adds it to your data stream, with individually highlighted snippets per matched rule. This feature allows you to gather data from many rules through a single stream while easily matching the results back to your predefined rules.

Each rule allows filtering by title, content, author, language, URL, country, media type and more parameters, using the same syntax as in our Talkwalker Search interface. You can also apply a list of sources to be included in or excluded from the stream, to give you even further possibilities to narrow down the results you will get. A single rule can support up to 50 operands. To create complex rules, operands may be combined using Boolean operators.

The documents are streamed in the order they are found by our crawlers and added to Talkwalker (i.e. by search_indexed timestamp). Custom sorting is not possible with the Streaming API; however, this can be done with the Search API. The documents are grouped in timeframes, which contain all documents that were indexed between the given start and end time of the timeframe.

Each result, independent of how many rules match, will be counted as 1 credit.

Stream Format
Stream
A S
89. within the last 30 days. In addition, a histogram of the number of results can also be returned. You can sort the results by publication time, indexing time, engagement or other metrics. A single search query can support up to 50 operands. To create complex queries, operands may be combined using Boolean operators.

A few words about the results
Search results can be sorted by engagement, time, or other metrics and be restricted to specific attribute value ranges (for example, only return results published in a certain time range). When no special filters are applied, a single search request will return results from all media types and all languages over the past 30 days, sorted by engagement by default. You don't need to execute one search request for each language and media type separately. To get a smaller set of results, you can either get only the highest ranked results or get a random sample set.

Parameters
parameter      description                                required   default value
access_token   API access token                           required
q              The query to search for                    required
offset         Number of results to skip (for paging)     optional   default: 0
hpp            Number of hits per page (for paging)       optional   default: 10, maximum: 500
sort_by        Criteria for sorting the results           optional   default: engagement
sort_order     Sorting order (ascending or descending)    optional   default: desc
hl             Turns highlighting on or off               optional   default: 1
pretty         Formatted json for testing                 optional   false

More on the Talkwalker Query Syntax
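The two paging parameters in the table above combine in the usual way: the number of results to skip is the page index times the page size. A minimal Python sketch follows; the function name page_params is hypothetical, and the parameter name offset is an assumption because it is garbled in the table.

```python
MAX_HPP = 500  # maximum hits per page, from the parameter table


def page_params(page, hpp=10):
    """Return the paging parameters for a zero-based page index."""
    if page < 0:
        raise ValueError("page must not be negative")
    if not 1 <= hpp <= MAX_HPP:
        raise ValueError("hpp must be between 1 and 500")
    # 'offset' is the assumed name of the skip parameter.
    return {"offset": page * hpp, "hpp": hpp}


# The third page of 50 hits starts after the first 100 results.
print(page_params(2, hpp=50))
```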
90. ype
            switch (type) {
                case "CT_ERROR":
                    Map<String, Object> errorChunk = getAsMap(o, "chunk_error");
                    handleStreamError(errorChunk);
                    break;
                case "CT_CONTROL":
                    Map<String, Object> controlChunk = getAsMap(o, "chunk_control");
                    if (controlChunk != null) {
                        Long timeframeStart = getAsT(controlChunk, "timeframe_start", Long.class);
                        if (timeframeStart != null) {
                            resumeTs.set(timeframeStart);
                        }
                    }
                    handleStreamControl(controlChunk);
                    break;
                case "CT_RESULT":
                    Map<String, Object> resultChunk = getAsMap(o, "chunk_result");
                    handleStreamResult(resultChunk);
                    break;
                default:
                    unhandledStreamChunk(o);
                    break;
            }
        } else {
            unhandledStreamChunk(o);
        }
    }

    protected static Map<String, Object> getAsMap(Map<String, Object> o, String key) {
        if (o != null) {
            Map<String, Object> ret = null;
            Object oRet = o.get(key);
            if (oRet != null && oRet instanceof Map) {
                return (Map<String, Object>) oRet;
            }
        }
        return null;
    }

    protected static <T> T getAsT(Map<String, Object> o, String key, Class<T> clz) {
        if (o != null) {
            Map<String, Object> ret = null;
            Object oRet = o.get(key);
            if (oRet != null && clz.isInstance(oRet)) {
                return (T) oRet;
            }
        }
        return null;
    }

    protected void onInitializationError(Map<String, Object> errorData) {
        System.out.println("ERROR " + errorData);
    }

    protected void handleStreamError(Map<String, Object> errorChunk) {
        System.out.println("ERROR " + errorChunk);
    }

    protected void handleStr