The Business Data Service (BDS) team is a small but critical team of skilled software engineers within Coupa who are passionate about data. Processes developed by BDS collect data from various external and internal sources, filter it, and output aggregated data sets for data consumers.
At BDS you will become familiar with handling large data sets, and you will use the latest data management tools, such as Spark (PySpark and PySpark SQL), Jupyter notebooks, MySQL, Metabase (Presto SQL), S3, and likely many others. The whole team of engineers will support you and make sure you hit the ground running.
BDS is an international team with members in the USA, Switzerland and India, and we welcome people from all backgrounds.
The role is to join the BDS team as a software engineer and take ownership of two main areas: enrichment process reporting and company diversity classification.
1. Enrichment Process Reporting:
To measure and share the effectiveness of the data processing, a reporting solution has been built that uses PySpark in Jupyter notebooks to pull data from AWS S3 storage, process it, and upload it to Google Drive Spreadsheets. Multiple reporting spreadsheets have been produced to track important metrics. Your responsibilities will include maintaining and further developing the existing reporting solutions.
2. Company Diversity Classification:
Perhaps the single most important data point BDS provides is a company's diversity classification. In the US especially, the importance of doing business with companies that are minority owned, women owned, or hold another diversity classification has grown massively. BDS offers these classifications for internal and external company records and company lists.
Diversity classification data is collected through 100 automated web scrapers that target diversity classification sources. Collected data is then normalized in a combined database that is used to classify company records.
Your responsibility will be to maintain and develop the processes related to diversity classification data collection. You will also be expected to learn in depth the meanings of, and differences between, the various diversity classifications so that you can communicate classification results and evaluate potential new data sources.
Required skills:
Python development - Processes are developed in a Python codebase, and having skills with it will make it much easier to get started
SQL queries - You will be dealing with a lot of data tables, and knowledge of how to write and interpret SQL queries is important
Fluent English skills - English is the working language of the team, and therefore understanding and being understood in both written and spoken English is critical
Preferred skills:
Google Drive Spreadsheets - Integration with external data sources; experience with reporting and graphs
Website scraping technologies and techniques (Selenium, PhantomJS, BeautifulSoup, Requests) - Ability to determine which data points from a data source will be useful and how best to extract that data in a structured format; parsing and extracting those data points using HTML structure and regular expressions