Building a successful Apify Actor is part engineering, part product thinking. The scraping logic is table stakes — what makes an Actor successful in the Store is discoverability, usability, and reliability.
Before writing code, validate demand for the Actor you have in mind. Once you are confident there is an audience, scaffold the standard Actor project structure:
```text
my-actor/
├── .actor/
│   ├── actor.json          # Actor metadata + Store listing
│   └── input_schema.json   # Input configuration UI
├── src/
│   └── main.js             # Entry point
├── Dockerfile              # Runtime environment
├── package.json
└── README.md               # Store description
```
`.actor/actor.json` describes the Actor and how it is listed in the Store:

```json
{
    "actorSpecification": 1,
    "name": "medium-article-scraper",
    "title": "Medium Article Scraper",
    "description": "Scrape articles from Medium.com by URL, author, or topic",
    "version": "1.0",
    "buildTag": "latest",
    "environmentVariables": {},
    "dockerfile": "./Dockerfile",
    "input": "./input_schema.json",
    "storages": {
        "dataset": {
            "actorSpecification": 1,
            "title": "Scraped articles",
            "description": "Dataset of scraped Medium articles"
        }
    },
    "isPublic": true,
    "seoTitle": "Medium Article Scraper - Extract Articles, Authors & Stats",
    "seoDescription": "Scrape Medium articles by URL or author. Extract title, content, claps, responses, and reading time. Export to JSON, CSV, or Excel."
}
```
`.actor/input_schema.json` defines the input form users fill in before a run; Apify generates the configuration UI from it:

```json
{
    "title": "Medium Scraper Input",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "urls": {
            "title": "Article URLs",
            "type": "array",
            "description": "List of Medium article URLs to scrape",
            "editor": "stringList",
            "items": { "type": "string" }
        },
        "maxArticles": {
            "title": "Max Articles",
            "type": "integer",
            "description": "Maximum number of articles to scrape",
            "default": 10,
            "minimum": 1,
            "maximum": 1000
        }
    },
    "required": ["urls"]
}
```
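With the schema in place, the entry point reads that input and writes results to the default dataset. Below is a minimal sketch of what `src/main.js` could look like, assuming the Apify SDK v3 (`apify`) together with Crawlee's `CheerioCrawler`; the selector and output fields are illustrative placeholders, not Medium-specific extraction logic.

```javascript
// src/main.js — minimal sketch (Apify SDK v3 + Crawlee); selectors are placeholders
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';

await Actor.init();

// Input fields match the ones declared in input_schema.json
const { urls = [], maxArticles = 10 } = (await Actor.getInput()) ?? {};

const crawler = new CheerioCrawler({
    maxRequestsPerCrawl: maxArticles,
    async requestHandler({ request, $ }) {
        // Push one item per article into the default dataset
        await Actor.pushData({
            url: request.url,
            title: $('h1').first().text().trim(),
        });
    },
});

await crawler.run(urls);
await Actor.exit();
```

Note that top-level `await` requires `"type": "module"` in package.json (see the sketch further below).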
The Dockerfile sets up the runtime environment:

```dockerfile
# IMPORTANT: Use Node 20+ for modern npm packages
FROM apify/actor-node:20

COPY package*.json ./
RUN npm ci --omit=dev --omit=optional

COPY . ./
CMD npm start
```
⚠️ A common pitfall: using `actor-node:18` with packages that need Node 20+. This causes cryptic runtime errors (like `File is not defined`). Always check your dependency requirements.
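To make that Node requirement explicit, and to back the `npm start` command the Dockerfile relies on, the package.json can declare an `engines` range. A sketch with assumed package names and version ranges (JSON allows no comments, so the assumptions are stated here): the `apify` and `crawlee` dependencies correspond to the main.js sketch above.

```json
{
    "name": "medium-article-scraper",
    "version": "1.0.0",
    "type": "module",
    "engines": { "node": ">=20" },
    "scripts": {
        "start": "node src/main.js"
    },
    "dependencies": {
        "apify": "^3.0.0",
        "crawlee": "^3.0.0"
    }
}
```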