Data Engineer

The Center for Government Excellence is seeking a Data Engineer who will help create and maintain complex data pipelines, from raw data acquisition to visualization. The main responsibilities of this role include designing, producing, and maintaining software infrastructure that automatically extracts, transforms, and loads data from diverse sources; reading data from and writing data to databases and running complex queries; and creating scripts to manipulate, clean, and filter data and to prepare data outputs for data visualization or web display.

The role also includes creating systems to monitor and guarantee data quality, performing daily data quality assurance tasks, maintaining and troubleshooting the infrastructure, and auditing data sources. The Data Engineer will also help explore and analyze data to answer policy-related questions that are critical to local governments. The Data Engineer must have experience developing software for data processing, knowledge of data analytics, exceptional organizational skills and attention to detail, intellectual curiosity, and a commitment to unlocking the value of data for the public good.

This position will support the full portfolio of data projects carried out by the GovEx Analytics Team, working in collaboration with Civic Impact’s analysts, data scientists, researchers, data visualization analysts, database managers, program manager, and external collaborators and partners. The Data Engineer will report to the Center for Government Excellence Program Officer.

Specific Duties & Responsibilities:

Data Engineering (50%)

  • Support the design, production, and maintenance of data pipelines for data acquisition, management, and transformation, as well as back-end code that powers data web applications and converts raw data into usable information
  • Write and maintain ETL/ELT processes that operate on a variety of structured and unstructured sources
  • Develop and maintain web data scraping systems for automatic data acquisition
  • Help design data architecture and provide ongoing support
  • Read data from and write data to databases, and perform queries
  • Create scripts to clean, transform, and analyze data
  • Put data pipelines into production using data warehousing systems

Data Quality Assurance (25%)

  • Create and deploy to production software that monitors data quality and detects data anomalies
  • Perform daily manual data quality assurance tasks
  • Support, maintain, and troubleshoot the software infrastructure

Data Analysis & Visualization (15%)

  • Source data, conduct analyses, visualize data, and generate insights to support ongoing research projects and other requests across the organization

Other Duties (10%)

  • Collaborate with developers, analysts, data scientists, researchers, policy experts, and other partners
  • Communicate with Civic Impact’s leadership and others on the team
  • Collaborate with external partners, contractors, and vendors

Minimum Qualifications (Mandatory):

  • Bachelor’s Degree
  • Four years of related experience in a professional environment where the required knowledge base and skills have been utilized, and a minimum of two years of data analytics experience
  • Additional education may substitute for experience and additional experience may be substituted for education, to the extent permitted by the JHU equivalency formula

Preferred Qualifications:

  • Bachelor’s Degree in Computer Science, Information Systems, Engineering, or a quantitative discipline
  • Experience developing and managing ETL/ELT processes
  • Experience with workflow orchestration tools such as Airflow
  • Experience designing and deploying data warehouse systems
  • Experience using project management and collaboration software such as Asana and Slack
  • Experience with version control software (e.g., Git and GitHub)
  • Experience coding in languages such as Python, R, or JavaScript

Special Knowledge, Skills & Abilities:

  • Expertise in Python programming for data acquisition, processing, and analysis
  • Expertise with SQL and relational databases
  • Excellent attention to detail and organizational skills
  • Capacity to work independently
  • Commitment to a collegial workplace
  • At least two years’ experience developing Python scripts, using object-oriented programming, for data processing and analysis
  • Knowledge of statistics, data analysis, and data quality standards
  • Knowledge of data scraping techniques
  • Familiarity with container management frameworks such as Docker and Kubernetes
  • Familiarity with public sector data and policies