Load multiple jobs into one Dagit service instance
Running dagit -f code.py loads a single Python file, so Dagit only ever sees the definitions in that one file.
When jobs and ops are spread across multiple Python files, how can they all be loaded into one Dagit instance so that they share the same web UI?
First, set up one Dagit service
Run a single 'dagit' command as a service. In the web UI, under 'Workspace', you can see and load the project definitions.
By default, the project definition is referenced by the workspace.yaml file in the project folder, as below:
load_from:
- python_package: mac_dagster_project
The Dagster project 'mac_dagster_project' is created by running the command "dagster project scaffold --name my-dagster-project".
Creating a project automatically generates a folder structure like this:
--my-dagster-project
----my-dagster-project
------repository.py
----workspace.yaml
----...others
The workspace.yaml file is what the dagit command reads to load the project definition.
Because it points to the Python package "mac_dagster_project" here, it refers to the sub folder "my-dagster-project", which has the same name as the project. The sub folder holds the project's source code.
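As a rough illustration of how a python_package entry maps to a folder: as far as I understand, the package is imported by name, so it resolves to whichever folder Python finds on its import path. The sketch below uses only the standard library and does not call Dagster. The package name demo_dagster_project and both helper functions are hypothetical, made up for this example.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def make_demo_project(root: Path, package_name: str) -> None:
    """Create a tiny package layout similar to the one described above."""
    pkg = root / package_name
    (pkg / "jobs").mkdir(parents=True)
    # __init__.py files make the folders importable as packages.
    (pkg / "__init__.py").write_text("")
    (pkg / "jobs" / "__init__.py").write_text("")
    (pkg / "repository.py").write_text("# repositories would be defined here\n")


def resolve_package_dir(project_root: Path, package_name: str) -> Path:
    """Return the folder Python would import for package_name
    when project_root is on the import path."""
    sys.path.insert(0, str(project_root))
    try:
        importlib.invalidate_caches()
        spec = importlib.util.find_spec(package_name)
        if spec is None or spec.origin is None:
            raise ModuleNotFoundError(package_name)
        # spec.origin is the package's __init__.py; its parent is the folder.
        return Path(spec.origin).parent
    finally:
        sys.path.remove(str(project_root))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_demo_project(root, "demo_dagster_project")
        print(resolve_package_dir(root, "demo_dagster_project"))
```

Note that an importable Python package name cannot contain hyphens, which is why the demo package uses underscores.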
The repository.py file defines the Dagster repositories, e.g.
from dagster import repository
from mac_dagster_project.jobs.myjob import myjob
from mac_dagster_project.schedule import schedule_myjob
@repository
def mac_repo():
    return [
        myjob,
        schedule_myjob,
    ]
Note that it imports the job from the jobs sub folder and the schedule from the schedule.py file.
It then simply lists the job and the schedule as members of the repository.
The jobs sub folder contains an '__init__.py' so that it is recognised as a Python package, and a Python file "myjob.py" that defines the job. For convenience, the job file can define its ops as well; otherwise, it can import ops from modules kept in another sub folder or file. This way, every job has its own Python file, which keeps the jobs separate.
A job definition looks like:
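from dagster import job, op
@op
def do_sth():
    return 1
@op
def do_sth_else(a):
    # Receives the output of do_sth as its input.
    return 2
@job
def myjob():
    # Passing one op's output to another wires up the dependency.
    a = do_sth()
    do_sth_else(a)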
Similarly, the schedule.py file just imports the job definition and defines a schedule, here one that runs daily at midnight.
from dagster import ScheduleDefinition
from mac_dagster_project.jobs.myjob import myjob
schedule_myjob = ScheduleDefinition(job=myjob, cron_schedule="0 0 * * *")
The final file structure looks like:
--my-dagster-project
----my-dagster-project
------repository.py
------schedule.py
------jobs
--------__init__.py
--------myjob.py
----workspace.yaml
----...others