Paperpile: Backend Engineer (paperpile.com)
421 points by paperpile 21 days ago | 100 comments

Paperpile runs on data at scale, with a literature database of 250M+ academic papers and a growing body of user data accumulated over more than a decade. You'll work across the systems that ingest, process, store, and serve this data reliably: building pipelines, optimizing search, handling PDFs at scale, and exposing clean APIs.

Requirements

  • Strong backend engineering background with experience building and operating data-heavy systems in production.
  • Experience deploying and operating services on AWS.
  • Experience designing and maintaining data ingestion pipelines handling messy, heterogeneous sources. Comfortable with web scraping and working with third-party data sources and APIs.
  • Familiarity with Node.js and TypeScript. It’s fine if you come from a different background, such as Java or Python, but you should be comfortable working in this environment.
  • High standards for data quality. You think carefully about correctness, deduplication, and consistency.
  • Solid understanding of full-text search systems including indexing strategy, relevance tuning, and query optimization.
  • Proficient in building reliable REST APIs.

More useful experience

  • Familiarity with academic publishing formats and data sources (PubMed, Crossref, arXiv…).
  • Experience with PDF processing pipelines (extraction, transformation, storage and delivery at scale).
  • Experience with LLM-based document processing or ML pipelines for extracting structured data from unstructured text.
  • Large-scale web crawling and scraping.

Compensation

  • Base compensation of €60,000–€90,000, depending on your level of experience.
  • Bonus/equity program.



Nice

I don't see why anyone would pay for what this startup does. You can easily build this yourself in 2 hours but serverless by double-dipping NextJS, Tailwind, Jest, Gatsby, Enzyme and Webpack on top of Kubernetes in a Docker VM on AWS.

This is a parody