We’re looking to extract structured data about company fundraising events (startups receiving investment) and acquisitions (one company buying another).
We have a list of RSS feeds. Some articles contain this type of information; others cover unrelated subjects. The solution should first classify each article as not relevant, acquisition, or fundraising, and then extract the following fields from the relevant articles.
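As a rough illustration of the three-way classification step, here is a minimal keyword-based sketch. The pattern lists, the label strings, and the function names are assumptions made for this example, not part of the requirements; a production solution would likely use a trained classifier.

```python
import re

# Hypothetical keyword patterns; a real system would tune or replace these.
FUNDING_PATTERNS = [r"\braises?\b", r"\braised\b", r"\bfunding\b",
                    r"\bseries [a-e]\b", r"\bseed round\b"]
ACQUISITION_PATTERNS = [r"\bacquires?\b", r"\bacquired\b",
                        r"\bacquisition\b", r"\bbuyout\b"]


def count_hits(patterns, text):
    """Count case-insensitive matches of all patterns in the text."""
    return sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)


def classify_article(text):
    """Return 'funding', 'acquisition', or 'not_relevant' for an article."""
    funding = count_hits(FUNDING_PATTERNS, text)
    acquisition = count_hits(ACQUISITION_PATTERNS, text)
    if funding == 0 and acquisition == 0:
        return "not_relevant"
    # Ties are resolved in favor of 'funding' (an arbitrary choice).
    return "acquisition" if acquisition > funding else "funding"
```

For example, `classify_article("Acme raises $10M in Series A funding")` returns `"funding"` under these patterns. Keyword matching of this kind is only a baseline and will misclassify articles that mention both events or use unusual wording.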
Data fields to extract for funding round:
Company name
Company website
Company description
Company country
Company HQ city
Industries
Launch year
Funding round type
Funding round amount
Funding round currency
Funding round date
Investor names
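One possible way to represent a funding round record is sketched below. The class name, attribute names, types, and the choice of ISO 4217 currency codes are assumptions for illustration; only the list of fields comes from the requirements above.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
from typing import List, Optional


@dataclass
class FundingRound:
    """One extracted funding event; fields missing from an article stay None."""
    company_name: str
    company_website: Optional[str] = None
    company_description: Optional[str] = None
    company_country: Optional[str] = None
    company_hq_city: Optional[str] = None
    industries: List[str] = field(default_factory=list)
    launch_year: Optional[int] = None
    round_type: Optional[str] = None       # e.g. "Seed", "Series A"
    round_amount: Optional[float] = None   # numeric amount, no currency symbol
    round_currency: Optional[str] = None   # assumed ISO 4217 code, e.g. "USD"
    round_date: Optional[date] = None
    investor_names: List[str] = field(default_factory=list)


# Illustrative record with made-up values.
example = FundingRound(
    company_name="Acme",
    round_type="Series A",
    round_amount=10_000_000,
    round_currency="USD",
    investor_names=["Example Ventures"],
)
record = asdict(example)  # plain dict, ready for JSON or CSV export
```

Keeping every field except the company name optional reflects that many articles will not state all of the values.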
Data fields to extract for exit / acquisition:
Target company name
Target company country
Target company HQ city
Exit type (acquisition, LBO)
Transaction date
Acquiring company name
Acquiring company country
Total EV (enterprise value) amount
Total EV amount currency
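The exit / acquisition fields could be modeled along the following lines. The names `ExitType` and `Acquisition`, the attribute names, and the types are hypothetical; the enum simply restricts the exit type to the two values listed above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ExitType(Enum):
    """The two exit types named in the requirements."""
    ACQUISITION = "acquisition"
    LBO = "lbo"


@dataclass
class Acquisition:
    """One extracted exit event; fields missing from an article stay None."""
    target_company_name: str
    exit_type: ExitType
    target_company_country: Optional[str] = None
    target_company_hq_city: Optional[str] = None
    transaction_date: Optional[date] = None
    acquiring_company_name: Optional[str] = None
    acquiring_company_country: Optional[str] = None
    total_ev_amount: Optional[float] = None
    total_ev_currency: Optional[str] = None  # assumed ISO 4217 code, e.g. "EUR"


# Illustrative record with made-up values.
deal = Acquisition(
    target_company_name="Acme",
    exit_type=ExitType("acquisition"),  # raises ValueError for unknown types
    acquiring_company_name="BigCo",
    transaction_date=date(2020, 8, 1),
)
```

Using an enum means an unexpected exit type fails loudly at parse time instead of silently entering the dataset.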
Language: English only
Data sources
• https://www.techmeme.com/feed.xml
• https://techcrunch.com/feed/
• https://thenextweb.com/feed/
• https://siliconcanals.com/feed/
• https://www.geektime.com/rss/
• Google News alert with term: “funding” AND “raises”
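A minimal sketch of reading articles from the feeds above using only the Python standard library follows. It assumes the feeds use the RSS 2.0 `<item>` layout; Atom feeds or feeds with namespaced elements would need extra handling. The function names and the output dictionary keys are choices made for this example.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URLS = [
    "https://www.techmeme.com/feed.xml",
    "https://techcrunch.com/feed/",
    "https://thenextweb.com/feed/",
    "https://siliconcanals.com/feed/",
    "https://www.geektime.com/rss/",
]


def parse_rss_items(xml_text):
    """Extract title, link, date, and summary from every RSS <item>."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


def fetch_feed(url, timeout=10):
    """Download one feed and return its parsed items."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_rss_items(response.read())
```

Calling `fetch_feed(url)` for each entry in `FEED_URLS` yields a list of article stubs. Feed summaries are often truncated, so the full article text would usually still need to be downloaded from each `link` before extraction.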
Posted On: August 26, 2020 14:23 UTC
Category: Data Extraction