Data acquisition

250.0 EUR

250.0 EUR peopleperhour Technology & Programming Overseas
118 days ago

Details

We are a software company. For one of our projects we need to download information from a website containing articles on medical topics. The website has approximately 10,000 HTML pages of paged article listings in Czech. Each listing contains article titles, and each title links to a detail HTML page with the article text.
We need someone to produce wget and other scripts that download the titles of all articles, parse the links from those titles, download the detail pages of the articles, and distill the text shown on each page.
The titles and the detail pages mostly share the same structure, which allows the work to be automated. This does not hold in 100% of cases, however: there may be several types of structure, so it may take some attention to distill the correct information.
The result of this work will be a set of static HTML files. See https://fomenot.com/z/dwld24/main.html for this structure. That is, the result will contain the contents of each article separated into paragraphs of normal text and captions (nothing else, no images or other text). We only want the main article text that is visible on screen to the user, with no other text or HTML content.
A second result will be the raw HTML output for each of the detail pages.
To accept the output, we will check the result ourselves. If we find errors, we will give examples of them and expect the vendor to fix all such errors in the result, not just the examples. If there are only a few errors we may not find them, and that is OK, but any errors we do find must be corrected.
We expect the raw HTML files to be 100% error free (for these we will not give examples; we will simply demand that they be fixed). For the text-based results we will give examples before demanding fixes.
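The distillation step described above (keep only paragraphs and captions, drop everything else) could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the requested deliverable. It assumes, hypothetically, that body text sits in <p> elements and captions in <figcaption> elements; the real site's markup is not known here and may vary between pages. The names ArticleTextParser and distill are illustrative.

```python
from html.parser import HTMLParser

# Assumption: paragraphs are <p> and captions are <figcaption>.
# The real pages may use different tags or several layouts.
TEXT_TAGS = {"p": "paragraph", "figcaption": "caption"}
SKIP_TAGS = {"script", "style"}


class ArticleTextParser(HTMLParser):
    """Collect (kind, text) pairs for paragraphs and captions only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []   # collected (kind, text) pairs, in page order
        self._kind = None  # kind of the block currently being read
        self._buf = []     # text fragments of the current block
        self._skip = 0     # nesting depth inside <script>/<style>

    def _flush(self):
        # Store the current block, with whitespace runs collapsed.
        if self._kind is not None:
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append((self._kind, text))
        self._kind = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag in TEXT_TAGS:
            self._flush()  # also closes an unclosed previous <p>
            self._kind = TEXT_TAGS[tag]

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip:
            self._skip -= 1
        elif tag in TEXT_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._kind is not None and not self._skip:
            self._buf.append(data)


def distill(html):
    """Return the paragraphs and captions found in one detail page."""
    parser = ArticleTextParser()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep a trailing block whose tag was never closed
    return parser.blocks
```

Pages that do not match the assumed structure would return few or no blocks, which is one simple way to flag them for manual attention.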
See https://www.idnes.cz/onadnes/zdravi/2 for an example of such a source page. It shows a list of articles, each with a link leading to the detail page, followed by a paging control that can load more articles from the next page. This is NOT the page we need to download, but a similar one. The example is given only so that you understand the task.
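For a listing page of the kind just described, link extraction and paging might look like the sketch below. It assumes, purely for illustration, that title links carry a CSS class named article-title and that the page number is the last path segment of the listing URL, as in the example URL above. Both assumptions would need to be checked against the real site, and the function names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class TitleLinkParser(HTMLParser):
    """Collect absolute URLs of <a> tags that carry a given CSS class."""

    def __init__(self, base_url, link_class):
        super().__init__()
        self.base_url = base_url
        self.link_class = link_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        href = attr.get("href")
        if href and self.link_class in classes:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.links:           # skip duplicates, keep order
                self.links.append(url)


def extract_title_links(html, base_url, link_class="article-title"):
    """Return detail-page URLs found in one listing page.

    link_class is an assumed class name, not taken from the real site.
    """
    parser = TitleLinkParser(base_url, link_class)
    parser.feed(html)
    parser.close()
    return parser.links


def listing_urls(base, first_page, last_page):
    """Build listing URLs, assuming the page number is the last path segment."""
    return [f"{base.rstrip('/')}/{n}" for n in range(first_page, last_page + 1)]
```

The resulting URL lists could be written to a text file and passed to wget with its -i option to fetch the raw HTML of each detail page.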
Let us know whether you can do this and at what price. We will provide the real links to the selected candidate.

Similar recommendations

Playtesters Needed for Tactical Strategy Game (Windows Only, Fixed Pay)
Help us polish an upcoming tactics game! We're seeking casual and hardcore gamers to playtest bite-sized segments (20–40 minutes) of our turn-based strategy project. Share your thoughts on sound, UI, balance, and bugs for fixed compensation per session.
About the Project:
Command Unique Heroes: Lead diverse commanders, each with distinct abilities and game-changing powers.
Epic Single-Player Campaign: Test 30+ missions across dynamic environments (forests, deserts, icy tundras, and toxic wastelands).
Multiplayer Mayhem: Battle friends online or locally.
Massive Unit Variety: Strategize with 30+ land, air, and sea units: tanks, jets, submarines, and specialist infantry like snipers and saboteurs.
Custom Maps: Experiment with a built-in mission editor.
Your Role:
Play 20–40-minute sessions (campaign, multiplayer, or custom maps) and report feedback on the following.
Sound Quality: Music, SFX, and overall immersion.
UI/UX: Clarity, menu navigation, and responsiveness.
Bugs/Glitches: Technical hiccups or visual oddities.
Balance Issues: Overpowered units, skills, or progression friction.
Requirements:
No prior experience needed, just enthusiasm for strategy/tactics games.
Basic ability to articulate what feels "off" or "awesome" in gameplay.
Windows PC (64-bit): Minimum specs include an Intel i3 (6th Gen) or equivalent, at least 4GB RAM, and an OpenGL 3.3-compatible GPU.
Compensation: Fixed payment per playtest session (simple and hassle-free!).
Let's make strategy games unforgettable, one playtest at a time! (Indie team, no NDAs, just a shared passion for crisp gameplay and killer soundtracks!)
Note: Currently available for Windows only. Linux/SteamOS support may come later!
517.0 GBP Technology & Programming peopleperhour Overseas
1 day ago