A minimal hardcoded definition of the structure: probably a few hundred lines.
The actual definition, including reusable components, optional features, and flexibility for experimentation: probably a few thousand.
The code needed to train the model, including all the data pipelines and management, training framework, optimization tricks, etc.: tens of thousands.
The whole codebase, including experiments, training/inference monitoring, modules that didn't make it into the final architecture, unit tests, and all custom code written to support everything mentioned so far: hundreds of thousands.