CS4531 Data Operations and DevSecOps
This graduate-level course provides a comprehensive foundation in data operations, emphasizing the principles, technologies, and best practices essential for modern data management and analytics pipelines. Students will gain deep expertise in data administration standards, data warehousing, and data mining techniques, and will query structured (SQL) and unstructured (NoSQL) datasets with cloud-hosted tools. The curriculum covers the design and implementation of database architectures, data models, and data dictionaries, with hands-on skills in managing, cleaning, transforming, and curating large and complex datasets. The course also explores the structural requirements of databases, automated data pipelines, and data operations, integrating automation scripting and data lifecycle management for scalable artificial intelligence and analytics solutions. Emphasis is placed on gaining experience with DevSecOps and MLOps practices that keep data workflows secure, automated, and reproducible throughout the machine learning lifecycle. Students will learn about storage strategies, format conversions, extract, transform, and load (ETL) processes, and data mapping tools through realistic scenarios and project-based assignments. Prerequisites: basics of databases (relations, schemas, keys, normalization, SQL-based query languages, selects and projects, joins, statistical operations, indexing, hashing, updating values, archiving), and programming courses or evidence of solid programming experience.
Prerequisite
Basics of databases (relations, schemas, keys, normalization, SQL-based query languages, selects and projects, joins, statistical operations, indexing, hashing, updating values, archiving); programming courses or evidence of solid programming experience
Corequisite
None
Lecture Hours
3
Lab Hours
2