TheNews
I wrote a Chrome Extension that lets you read a breaking / live news headline every time you open a new tab. Here it is. I wrote it because I often lose track of what’s happening in the world: some days while participating in adult conversation, I speak exclusively in “Huh?”s.
I’d previously released FrontPageNYT which did something similar, except that it sourced news exclusively from the New York Times. While NYT is my preferred news establishment for longer-form articles / opinion pieces, I soon realized that it wasn’t the kind of news source suited for quick-5-second-perusal-of-the-latest-headlines, which is what I wanted for an extension like this. With a feeling of dread, I realized I would have to cast my headline-sourcing net wider. Dread, because I did not look forward to scraping news websites. Even if they provided APIs, like the New York Times does, creating and managing multiple developer accounts and talking to multiple APIs - each a special snowflake with its own idiosyncratic request and response format - sounded like the opposite of fun. Fortunately, because of the existence of giants with excellent load-bearing shoulders, I didn’t have to.
I discovered https://newsapi.org/. By signing up for a developer account, I was able get live headlines from 70+ news sources / blogs. And at the nonpareil price of FREE, it fit in my budget.
The major issue was that the API does not support querying for headlines by category (business, tech etc). The two things the API let me do were:
- Query all the news sources corresponding to a category, and
- Query all the stories corresponding to a news source.
Every news source is hardcoded to have one category (Ars Technica is ‘technology’, ESPN is ‘sports’, New York Times is, well, ‘general’). Given that what I actually wanted to do was show users news in categories they were interested in, I had to work around all this. I came up with a cute program flow, so I thought I’d share.
I call it the Fetch → Decode → Display flow, inspired by the microprocessor instruction execution cycle (it’s actually more Fetch → Fetch → Decode → Display, but the former sounds snappier).
Fetch # 1
First, I let the user check-mark what categories of news they’re interested in the extension’s Options. When a new tab opens, I load the user’s saved options, build a list of categories they’re interested in, then issue a request for each one:
I map over all the categories, issuing a Promise
for each one (in fetch()
function) and then using Promise.all
to join on all of them (list of Promises
→ Promise
of a list). The output of this function is a Promise
-wrapped list of list of news sources. For example, if the categories chosen by user were ['technology', 'business']
, this might get resolved to [['ars-technica', 'hacker-news'], ['wall-street-journal', 'economic-times']]
.
Fetch #2
Once I have a list of news sources that are of interest to the user, I issue a request to get the latest stories for each source in a similar fashion:
I flatten the list of list of sources because at this point I don’t care about separating by category; all of these sources will be used to get stories:
[['ars-technica', 'hacker-news'], ['wall-street-journal', 'economic-times']]
→ ['ars-technica', 'hacker-news', 'wall-street-journal', 'economic-times']
.
Then, I map over this list and get stories corresponding to each.
Decode and Display
This list of stories then gets fed into a decode function, and finally a display function. As you can imagine, in the decode()
function I process and clean up the returned stories, and in the display()
function I populate the HTML. Because of this design, I was able to reuse the code I wrote for decode()
and display()
for the FrontPageNYT extension - only the fetching code had to change. For details you can check out the code, but the point is that (I think) the code is cute; in the end, this is what I call:
Voila! I cache results (with a user-configurable timeout) in local storage
so that this whole process does not have to happen every time a new tab is opened. The true
you see as an argument for the display()
function here means ‘update cache’. The argument supplied is false
in other invocations of display()
when we don’t want to update the cache; for example, in the cases where the cache is still hot, or if there was a network failure and we can’t fetch new stories.