Showing posts from 2021

Web scraping using BeautifulSoup in Python : EAN number vs price from a German e-commerce website

Input: A list of product category URLs, each containing a list of products to obtain information from.

https://www.duo-shop.de/de-DE/List/4/0/0/
https://www.duo-shop.de/de-DE/List/5/0/0/
https://www.duo-shop.de/de-DE/List/70/0/0/
https://www.duo-shop.de/de-DE/List/259/0/0/
https://www.duo-shop.de/de-DE/List/72/0/0/
https://www.duo-shop.de/de-DE/List/73/0/0/
https://www.duo-shop.de/de-DE/List/9/0/0/
https://www.duo-shop.de/de-DE/List/690/0/0/
https://www.duo-shop.de/de-DE/List/329/0/0/
https://www.duo-shop.de/de-DE/List/537/0/0/

Task: Get each EAN number and the price associated with it.

Output: A spreadsheet (or CSV) with EAN and price columns, with a separate file for each product category.

We begin by investigating the website to be scraped. It's a German e-commerce website selling a wide range of products (the product type doesn't matter for what we're trying to achieve here). All product listings follow the same HTML structure, so if we can get the information for one EAN, we can