CS4531 Data Operations and DevSecOps

This graduate-level course provides a comprehensive foundation in data operations, emphasizing the principles, technologies, and best practices essential for modern data management and analytics pipelines. Students will gain deep expertise in data administration standards, data warehousing, and data mining techniques, and will query structured (SQL) and unstructured (NoSQL) data sets with cloud-hosted tools. The curriculum covers the design and implementation of database architectures, data models, and data dictionaries, with hands-on skills in managing, cleaning, transforming, and curating large and complex datasets. The course also explores the structural requirements of databases, automated data pipelines, and data operations, integrating automation scripting and data lifecycle management for scalable artificial intelligence and analytics solutions. Emphasis is placed on gaining experience with DevSecOps and MLOps practices to ensure secure, automated, and reproducible data workflows throughout the machine-learning lifecycle. Students will learn about storage strategies, format conversions, extract, transform, and load (ETL) processes, and data mapping tools through realistic scenarios and project-based assignments. Prerequisites: Basics of databases (relations, schemas, keys, normalization, SQL-based query languages, selects and projects, joins, statistical operations, indexing, hashing, updating values, archiving), and programming courses or evidence of solid programming experience.


Prerequisite

Basics of databases (relations, schemas, keys, normalization, SQL-based query languages, selects and projects, joins, statistical operations, indexing, hashing, updating values, archiving); programming courses or evidence of solid programming experience

Corequisite

None

Lecture Hours

3

Lab Hours

2

Course Learning Outcomes

Data Management Principles and Standards

  • Demonstrate knowledge of data administration, standardization policies, and data governance best practices.
  • Evaluate and recommend database technologies and architectures aligned with project and organizational needs.

Database Systems, Warehouses, and Querying

  • Explain and utilize database management systems, including table relationships, views, and physical/virtual storage media.
  • Apply query languages such as SQL to extract, manipulate, and manage data.
  • Design and document data dictionaries and data models to support robust database structures.
  • Apply data warehouse principles and data mining techniques for large-scale analytics.
  • Design and evaluate database structures, data operations, and capacity planning aligned with performance and scalability requirements.

Data Intake and Operations, DevSecOps, MLOps, Traceability

  • Apply best practices for data acquisition, cleaning, transformation, and ingestion to support machine learning and analytics workflows.
  • Build and automate data management pipelines using scripts and architecture tools.
  • Apply DataOps, MLOps, and DevSecOps principles to create secure, automated, and scalable data workflows and ML deployments.
  • Discuss tools for data lifecycle management, data versioning, traceability, and continuous integration in ML-involved systems.
  • Monitor data flows and data assets and ensure secure operations across the data ecosystem.
  • Use data mapping tools and perform format conversions to standardize data across systems.