I recently saw this ArXiv twitter bot, and I thought it was pretty cool. Hundreds of new papers appear on ArXiv every day, some of them hundreds of pages long, which makes it almost impossible to keep up with all the latest research.
My co-founder Ivan and I are building the RCS API for developers, and as part of that we started doing Monday hackathons. So I thought it'd be cool to hack together something like the ArXiv twitter bot, but for text-message updates about trending AI papers.
Here's a guide on how I did it:
Getting & sending the data
Opting in / out
Deploying
ArXiv (pronounced "Archive") is an open-access repository of papers, mainly in STEM fields, that lets researchers release their work before it's published in traditional journals. A ton of amazing AI research comes through ArXiv or gets cross-published there, so I thought this would be a good source to learn from.
To get the papers, I wrote a simple script using ArXiv's RSS feeds:
def get_arxiv_papers(category='cs.ai', since=None) -> List[ArxivPaper]:
    """Fetch new papers for a category from ArXiv's RSS feed."""
along with defining an ArxivPaper object
@dataclass
class ArxivPaper:
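To flesh that out, here's a minimal sketch of the fetcher against ArXiv's RSS feeds at rss.arxiv.org, using only the standard library. The ArxivPaper fields below are my assumptions about the shape of the data, not necessarily the fields from the original script:

```python
from dataclasses import dataclass
from typing import List
import urllib.request
import xml.etree.ElementTree as ET


@dataclass
class ArxivPaper:
    arxiv_id: str
    title: str
    abstract: str
    link: str


def parse_arxiv_rss(xml_text: str) -> List[ArxivPaper]:
    """Parse an ArXiv RSS feed document into ArxivPaper objects."""
    root = ET.fromstring(xml_text)
    papers = []
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        papers.append(ArxivPaper(
            # The abs URL ends in the paper's id, e.g. .../abs/2401.00001
            arxiv_id=link.rsplit("/", 1)[-1],
            title=(item.findtext("title") or "").strip(),
            abstract=(item.findtext("description") or "").strip(),
            link=link,
        ))
    return papers


def get_arxiv_papers(category: str = "cs.AI") -> List[ArxivPaper]:
    """Fetch the latest papers for a category from ArXiv's RSS feed."""
    # ArXiv publishes per-category feeds at rss.arxiv.org/rss/<category>
    with urllib.request.urlopen(f"https://rss.arxiv.org/rss/{category}") as resp:
        return parse_arxiv_rss(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the HTTP call makes it easy to test against a saved feed without hitting the network.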
There's an issue though. Even though we have all the papers, we don't have a great way to track how these papers are being received.
For that, we'll use Mendeley's API. The reason is that saves by academics (just called views in our case) tend to be a decent predictor of paper success.1 You can get an API key from Mendeley here. From there, just create an application and grab the CLIENT_SECRET and CLIENT_ID ("ID" in the console / the generated secret via the Elsevier flow).
# Initiate the session
def get_mendeley_session():
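Here's a sketch of how that session setup could look, assuming the standard OAuth client-credentials flow against Mendeley's token endpoint. The helper names are mine, and the catalog lookup (the `arxiv` filter plus `view=stats` for reader counts) is how I remember Mendeley's catalog API working; double-check their docs:

```python
import requests

MENDELEY_TOKEN_URL = "https://api.mendeley.com/oauth/token"


def build_token_request(client_id: str, client_secret: str) -> dict:
    """Assemble the client-credentials token request for Mendeley's OAuth endpoint."""
    return {
        "url": MENDELEY_TOKEN_URL,
        "data": {"grant_type": "client_credentials", "scope": "all"},
        "auth": (client_id, client_secret),  # HTTP Basic auth with the app credentials
    }


def get_mendeley_session(client_id: str, client_secret: str) -> requests.Session:
    """Exchange app credentials for a bearer token and return an authed session."""
    req = build_token_request(client_id, client_secret)
    resp = requests.post(req["url"], data=req["data"], auth=req["auth"])
    resp.raise_for_status()
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {resp.json()['access_token']}"
    return session


def get_reader_count(session: requests.Session, arxiv_id: str) -> int:
    """Look up a paper in Mendeley's catalog and return its reader (save) count."""
    resp = session.get(
        "https://api.mendeley.com/catalog",
        params={"arxiv": arxiv_id, "view": "stats"},
    )
    resp.raise_for_status()
    results = resp.json()
    return results[0].get("reader_count", 0) if results else 0
```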
From there, I just upload the papers to a Supabase database (although you could use whatever database you like).
def save_papers_to_supabase(papers):
    """Upsert papers into the Supabase `papers` table."""
Now that we have the papers and their save counts to rank them by, we can send out the three most popular papers released that day. Ideally we'd also track Mendeley saves over time, so we could see what's trending beyond a paper's initial launch, but we're doing this in a couple of hours, so that's for the future.
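The ranking itself is just a sort on save counts; a minimal sketch:

```python
from typing import List, Tuple


def top_papers(papers_with_saves: List[Tuple[str, int]], n: int = 3) -> List[str]:
    """Rank (arxiv_id, save_count) pairs by saves, descending, and keep the top n."""
    ranked = sorted(papers_with_saves, key=lambda pair: pair[1], reverse=True)
    return [arxiv_id for arxiv_id, _ in ranked[:n]]
```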
To do this, we're going to write a simple function with Pinnacle's RCS API:
def sendPapers(to: str, papers: List[ArxivPaper]):
cards = []
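A rough sketch of the shape of that function. Note that the endpoint URL and payload schema below are illustrative placeholders, not Pinnacle's actual API; check their docs for the real request format:

```python
from typing import List

import requests

# Hypothetical endpoint, used only to illustrate the carousel structure.
SEND_URL = "https://api.example.com/rcs/send"


def build_card(title: str, abstract: str, link: str) -> dict:
    """One carousel card per paper, with a button linking to the paper itself."""
    return {
        "title": title[:200],
        "subtitle": abstract[:2000],
        "buttons": [{"title": "Read paper", "type": "openUrl", "payload": link}],
    }


def send_papers(api_key: str, to: str, papers: List[dict]) -> None:
    """Send the top papers as one RCS carousel to a subscriber."""
    payload = {
        "to": to,
        "cards": [build_card(p["title"], p["abstract"], p["link"]) for p in papers],
        # Quick replies attach once to the message, shared across the carousel,
        # while the buttons above belong to individual cards.
        "quickReplies": [{"title": "Unsubscribe", "payload": "UNSUBSCRIBE"}],
    }
    resp = requests.post(payload=None, url=SEND_URL, json=payload,
                         headers={"PINNACLE-API-KEY": api_key})
    resp.raise_for_status()
```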
The buttons are specific to each paper being sent, while the quick replies are shared across the whole carousel.
When we deploy, we can set up a cron job that runs the ArXiv RSS feed checker (since the feed only refreshes daily) and then sends out the trending papers.
Now that we have the backend flow, we need to let people sign up.
Before we message people, we need to get consent. For this, I set up a simple server-action flow with a Next.js API endpoint on Vercel that listens for tapbacks from the user to opt in (e.g., the opt-out button on the text-message thread in the picture above).
import { NextResponse } from "next/server";
import {
You can see the full actions code here.
I also built a basic frontend UI where users can opt in.
After signing up, they receive a message like this:
Now we also need a way for them to opt out.
To do this, we've already sent the payload for opting out and now just need to listen for it:
export async function POST(request: Request) {
console.log("Received post");
Resubscribing follows a similar flow (also shown in the code above).
To deploy our backend, I used Porter.run with a porter.yaml file. It was pretty simple to set up, and you can find docs on it here.
version: v2
name: arxiv
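For reference, a porter.yaml along these lines could define the daily job. The service name, schedule, and exact field names here are my assumptions from memory of Porter's v2 schema, so verify them against Porter's docs:

```yaml
version: v2
name: arxiv
services:
  - name: daily-papers
    type: job
    run: python main.py
    # Standard cron syntax; ArXiv's feeds only refresh daily,
    # so once a day is all we need.
    cron: "0 14 * * *"
```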
That's all it took to get the backend set up! To deploy the frontend, I just deployed to Vercel through their CLI and provisioned the subdomain arxiv.trypinnacle.app.
Sadly, all things come to an end eventually, and this is the end of post numero uno, but every ending is a new beginning! If you want to try using our API for RCS, you can sign up here!
The most "successful" papers aren't always the most reputable (cough cough, room-temperature superconductor paper, cough cough), and you can watch this to see why.
I think it would be cool to track the Mendeley views over time, and to use an LLM to summarize research papers that are sometimes hundreds of pages long.
(You can also contact us at founders@trypinnacle if you have any questions)