Kubeflow

Install

Kubeflow + Keycloak, Spark, MLflow 아키텍처. Spark와 MLflow 는 이미 설치가 되어 있으므로 Keycloak과 Kubeflow 설치만 진행한다.

Install Keycloak(Official)

wget -O keycloak.yaml https://raw.githubusercontent.com/keycloak/keycloak-quickstarts/refs/heads/main/kubernetes/keycloak.yaml

kubectl create ns keycloak
kubectl apply -f keycloak.yaml -n keycloak

만약 볼륨 설정이 필요한 경우에는 PVC 설정을 진행해야 한다.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak
  namespace: keycloak
  annotations:
      alb.ingress.kubernetes.io/scheme: internet-facing
      alb.ingress.kubernetes.io/target-type: ip
      alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
      alb.ingress.kubernetes.io/ssl-redirect: '443'
      alb.ingress.kubernetes.io/healthcheck-port: traffic-port
      alb.ingress.kubernetes.io/healthcheck-path: /
      alb.ingress.kubernetes.io/success-codes: '200'
spec:
  ingressClassName: "mlops-ingress-class"
  rules:
    - host: keycloak.bys.asia
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: keycloak
                port:
                  number: 8080
User: admin
Password: admin

기존에 설치한 Postgresql 을 활용하기 위해 다음과 같이 진행

# 기존 PostgreSQL에 접속
kubectl exec -it postgresql-0 -n postgresql -- psql -U postgres

# Keycloak용 사용자 생성
CREATE USER keycloak WITH PASSWORD 'keycloak';

# Keycloak용 데이터베이스 생성
CREATE DATABASE keycloak;

# 생성한 데이터베이스에 대한 권한 부여
GRANT ALL PRIVILEGES ON DATABASE keycloak TO keycloak;

# keycloak 데이터베이스로 전환
\c keycloak

# keycloak 유저에게 스키마 관련 권한 부여
GRANT ALL ON SCHEMA public TO keycloak;

KC_DB_URL_HOST 환경변수를 postgre 서비스로 변경 keyclaok.yaml

apiVersion: apps/v1
kind: StatefulSet
......
    spec:
      containers:
        - name: keycloak
          image: quay.io/keycloak/keycloak:26.2.5
          args: ["start"]
          env:
......
            # 기존 PostgreSQL 사용을 위한 설정 수정
            - name: 'KC_DB_URL_DATABASE'
              value: 'keycloak'
            - name: 'KC_DB_URL_HOST'
              value: 'postgresql.postgresql.svc.cluster.local'  # 기존 PostgreSQL 서비스 주소
            - name: 'KC_DB'
              value: 'postgres'
            - name: 'KC_DB_PASSWORD'
              value: 'keycloak'  # PostgreSQL에서 생성한 keycloak 사용자의 비밀번호
            - name: 'KC_DB_USERNAME'
              value: 'keycloak'  # PostgreSQL에서 생성한 keycloak 사용자

Setup Keycloak

  1. Create realm
    • Manage realms > Create realm > Realm name: mlops
  2. Create client
    create-client-01.png
    create-client-02.png
    create-client-03.png
    • Client를 생성하고 나면 Client authentication 을 활성화 해야 credential 정보를 생성할 수 있다.
    • URL
      • Valid redirect URIs: https://kubeflow.bys.asia/dex/callback
      • Web origins - https://kubeflow.bys.asia

client-secret.png

  1. Create User and set password Must verified E-mail. If you don’t, you got 500 Internal Server Error after login successfully.

Install Kubeflow(Official)

Kubeflow 를 지원하는 여러 배포판이 존재하지만 그 중 deployKF 를 사용한다.

Install ArgoCD

git clone -b main https://github.com/deployKF/deployKF.git ./deploykf
chmod +x ./deploykf/argocd-plugin/install_argocd.sh
bash ./deploykf/argocd-plugin/install_argocd.sh

argocd Ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd
  namespace: argocd
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/backend-protocol: HTTPS
    alb.ingress.kubernetes.io/healthcheck-port: traffic-port
    alb.ingress.kubernetes.io/healthcheck-path: /
    alb.ingress.kubernetes.io/success-codes: '200'
spec:
  ingressClassName: "mlops-ingress-class"
  rules:
    - host: argocd.bys.asia
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 443
## Password
USER=admin
PASSWORD=$(kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)
echo ${USER} ${PASSWORD}

Create Key for argo<->gtihub

ssh-keygen -t ed25519 
# mlops_demo

# Pub key
github

Cretae Github repositories

Connect Repositories
Argocd ssh connect repositories with private key

app-of-apps.yaml

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: deploykf-app-of-apps
  namespace: argocd
  labels:
    app.kubernetes.io/name: deploykf-app-of-apps
    app.kubernetes.io/part-of: deploykf
spec:

  ## NOTE: if not "default", you MUST ALSO set the `argocd.project` value
  project: "default"

  source:
    ## source git repo configuration
    ##  - we use the 'deploykf/deploykf' repo so we can read its 'sample-values.yaml'
    ##    file, but you may use any repo (even one with no files)
    ##
    repoURL: "https://gitlab.bys.asia/bys/argocd-values.git"
    targetRevision: main
    path: "."

    ## plugin configuration
    ##
    plugin:
      name: "deploykf"
      parameters:

        ## the deployKF generator version
        ##  - available versions: https://github.com/deployKF/deployKF/releases
        ##
        - name: "source_version"
          string: "0.1.5"

        ## paths to values files within the `repoURL` repository
        ##  - the values in these files are merged, with later files taking precedence
        ##  - we strongly recommend using 'sample-values.yaml' as the base of your values
        ##    so you can easily upgrade to newer versions of deployKF
        ##
        - name: "values_files"
          array:
            - "./dev-ap2-eks-demo/deploykf/values.yaml"

  destination:
    server: "https://kubernetes.default.svc"
    namespace: "argocd"
  syncPolicy:
    automated:
      prune: false
      selfHeal: false
    syncOptions:
      - CreateNamespace=false

Create ACM

# 도메인에 대한 ACM 신규 생성, 호스트 존은 동일하게 유지 =
*.bys.asia
*.kubeflow.bys.asia

If kubeflow apply in app-of-apps values file

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: default-ingress-class
spec:
  controller: eks.amazonaws.com/alb
  parameters:
    apiGroup: eks.amazonaws.com
    kind: IngressClassParams
    # Use the name of the IngressClassParams set in the previous step
    name: default-ingress-class-params
---
apiVersion: eks.amazonaws.com/v1
kind: IngressClassParams
metadata:
  name: default-ingress-class-params
spec:
  certificateARNs:
  - "arn:aws:acm:ap-northeast-2:558846430793:certificate/" ## ACM 수정 필요(*.kubeflow.bys.asia 도메인이 포함된걸로)
values.yaml (클릭시 내용 확장)
##
##
## This file is intended to be a base for your deployKF values.
## You may either copy and modify it, or include it as a base before your own overrides.
## For an example of overrides, see the 'sample-values-overrides.yaml' in the main repo.
##
## Differences from 'default_values.yaml':
##  - all tools are enabled
##  - secrets are randomly generated at install
##
## Notes:
##  - YAML maps are RECURSIVELY merged across values files
##  - YAML lists are REPLACED in their entirety across values files
##  - Do NOT include empty/null sections, as this will remove ALL values from that section.
##    To include a section without overriding any values, set it to an empty map: `{}`
##  - We don't show all sections/values, only those which are commonly overridden.
##    The full list is available at https://www.deploykf.org/reference/deploykf-values/
##
##

## --------------------------------------------------------------------------------
##
##                                      argocd
##
## --------------------------------------------------------------------------------
argocd:

  ## a prefix to use for argocd application names
  ##  - allows a single instance of argocd to manage deployKF across multiple clusters
  ##  - if non-empty, `argocd.destination` must be a remote cluster, this is because
  ##    a single cluster can only have one instance of deployKF
  ##
  appNamePrefix: ""

  ## the namespace in which argocd is deployed
  ##
  namespace: argocd

  ## the project used for deployKF argocd applications
  ##
  project: default

  ## the source used for deployKF argocd applications
  ##
  source:

    ## configs specifying the git repo which contains your generated manifests
    ##
    repo:
      ## the URL of your manifest git repo
      ##  - for example, if you are using a GitHub repo named 'deployKF/examples', you might set this value
      ##    to "https://github.com/deployKF/examples" or "git@github.com:deployKF/examples.git"
      ##
      url: ""

      ## the git revision which contains your generated manifests
      ##  - for example, if you are using the 'main' branch of your repo, you might set this value to "main"
      ##
      revision: ""

      ## the path within your repo where the generated manifests are stored
      ##  - for example, if you are using a folder named 'GENERATOR_OUTPUT' at the root of your repo,
      ##    you might set this value to "./GENERATOR_OUTPUT/"
      ##
      path: ""

  ## the destination used for deployKF argocd applications
  ##  - the value of `destination.name` takes precedence over `destination.server`
  ##
  destination:
    server: https://kubernetes.default.svc
    name: ""


## --------------------------------------------------------------------------------
##
##                              deploykf-dependencies
##
## --------------------------------------------------------------------------------
deploykf_dependencies:

  ## --------------------------------------
  ##             cert-manager
  ## --------------------------------------
  cert_manager:
    enabled: true
    namespace: cert-manager

    ## extra manifests
    ##  - a list of strings containing extra Kubernetes resource manifests
    ##
    extraManifests: []

    ## istio gateway certificate issuer configs
    ##  - if you wish to use your own ClusterIssuer, set `clusterIssuer.enabled` to false
    ##    and set `clusterIssuer.issuerName` to the name of your issuer, (this still works when you
    ##    bring your own cert-manager deployment by setting `cert_manager.enabled` to false)
    ##
    clusterIssuer:
      enabled: true
      issuerName: deploykf-gateway-issuer


  ## --------------------------------------
  ##                 istio
  ## --------------------------------------
  istio:
    enabled: true
    namespace: istio-system


  ## --------------------------------------
  ##                kyverno
  ## --------------------------------------
  kyverno:
    enabled: true
    namespace: kyverno


## --------------------------------------------------------------------------------
##
##                                  deploykf-core
##
## --------------------------------------------------------------------------------
deploykf_core:

  ## --------------------------------------
  ##             deploykf-auth
  ## --------------------------------------
  deploykf_auth:
    namespace: deploykf-auth

    ## dex configs
    ##
    dex:

      ## dex static passwords
      ##  - a list of users to create in dex's built-in password database
      ##  - each element is a map with keys `email` and `password`,
      ##    the `password` is a map with the following keys:
      ##     - `value`: the password value
      ##     - `existingSecret`: the name of a kubernetes secret containing the password (overrides `value`)
      ##     - `existingSecretKey`: the key in the secret that contains the password
      ##     - `type`: how the password is provided (default: "plain")
      ##         - "plain": the password is provided as plain text
      ##         - "hash": the password is provided as a bcrypt hash
      ##  - a bcrypt hash for "PASSWORD_STRING" can be generated with one of the following:
      ##     - echo "PASSWORD_STRING" | htpasswd -BinC 10 NULL | cut -d: -f2
      ##     - python -c 'import bcrypt; print(bcrypt.hashpw(b"PASSWORD_STRING", bcrypt.gensalt(10)).decode())'
      ##
      staticPasswords:
       - email: "bys@amazon.com"
         password:
           value: "bys"
      #  - email: "user2@example.com"
      #    password:
      #      value: "user2"

      ## dex connectors
      ##  - dex connectors which allow bridging trust to external identity providers
      ##    https://dexidp.io/docs/connectors/
      ##  - not all connector types support refresh tokens, notably "SAML 2.0" and "OAUTH 2.0" do not
      ##    however, most providers support "OpenID Connect" which does support refresh tokens
      ##    without refresh tokens, users will be forced to re-authenticate every `expiry.idToken` period
      ##  - each element is a map with keys `type`, `id`, `name`, and `config` (which are the same aas upstream dex)
      ##    additionally, `configExistingSecret` and `configExistingSecretKey` allow you to set `config`
      ##    from a YAML-formatted string in a kubernetes secret
      ##  - in most cases `config.redirectURI` will be set to "https://{DEPLOYKF_HOST}/dex/callback" (if port is 443)
      ##
      connectors:
        - type: oidc
          id: keycloak
          name: Keycloak
          config:
            issuer: https://keycloak.bys.bys.asia/realms/mlops

            clientID: "mlops-client"
            clientSecret: "GUZRQUzPOO7HyVWRL8tFPFwDKWVDX0uq"

            ## replace with your deploykf domain
            ## NOTE: this must be an allowed redirect URI in the Keycloak app
            redirectURI: https://kubeflow.bys.bys.asia/dex/callback

            ## openid scopes to request
            scopes:
              - openid
              - email
              - profile
              ## NOTE: offline_access is required for refresh tokens
              - offline_access

            ## keycloak does not always send the `email_verified` claim
            insecureSkipEmailVerified: true

            ## if your Keycloak uses a self-signed certificate
            #insecureSkipVerify: true

      ## dex OpenID Connect clients
      ##
      clients:
        ## OpenID client for oauth2-proxy (deployKF Dashboard)
        oauth2Proxy:
          clientId: "oauth2-proxy"
          clientSecret:
            existingSecret: "generated--dex-oauth2-proxy-client"
            existingSecretKey: "client_secret"
            generateSecret: true

        ## OpenID client for Minio Console
        minioConsole:
          clientId: "minio-console"
          clientSecret:
            existingSecret: "generated--dex-minio-console-client"
            existingSecretKey: "client_secret"
            generateSecret: true

        ## OpenID client for Argo Server
        argoServer:
          clientId: "argo-server"
          clientSecret:
            existingSecret: "generated--dex-argo-server-client"
            existingSecretKey: "client_secret"
            generateSecret: true

    ## oauth2-proxy configs
    ##
    oauth2Proxy:

      ## oauth2-proxy cookie configs
      ##
      cookie:
        secret:
          existingSecret: "generated--oauth2-proxy-cookie-secret"
          existingSecretKey: "cookie_secret"
          generateSecret: true


  ## --------------------------------------
  ##          deploykf-dashboard
  ## --------------------------------------
  deploykf_dashboard:
    namespace: deploykf-dashboard


  ## --------------------------------------
  ##        deploykf-istio-gateway
  ## --------------------------------------
  deploykf_istio_gateway:
    namespace: deploykf-istio-gateway
    ## extra manifests
    ##  - a list of strings containing extra Kubernetes resource manifests
    ##
    extraManifests:
      - |
        apiVersion: networking.k8s.io/v1
        kind: Ingress
        metadata:
          name: deploykf-kubeflow-gateway
          annotations:
            alb.ingress.kubernetes.io/scheme : internet-facing
            alb.ingress.kubernetes.io/target-type: ip
            alb.ingress.kubernetes.io/backend-protocol: HTTPS
            alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
            alb.ingress.kubernetes.io/ssl-redirect: '443'
            alb.ingress.kubernetes.io/healthcheck-port: "status-port"
            alb.ingress.kubernetes.io/healthcheck-protocol: HTTP
            alb.ingress.kubernetes.io/healthcheck-path: "/healthz/ready"
            alb.ingress.kubernetes.io/healthcheck-interval-seconds: '15'
            alb.ingress.kubernetes.io/healthcheck-timeout-seconds: '10'
            alb.ingress.kubernetes.io/healthy-threshold-count: '2'
            alb.ingress.kubernetes.io/unhealthy-threshold-count: '4'
            alb.ingress.kubernetes.io/success-codes: 200,302
        spec:
          ingressClassName: "default-ingress-class"
          rules:
            - host: "kubeflow.bys.bys.asia"
              http:
                paths:
                  - path: "/"
                    pathType: Prefix
                    backend:
                      service:
                        name: "deploykf-gateway"
                        port:
                          number: 443
            - host: "*.kubeflow.bys.bys.asia"
              http:
                paths:
                  - path: "/"
                    pathType: Prefix
                    backend:
                      service:
                        name: "deploykf-gateway"
                        port:
                          number: 443

    ## istio gateway configs
    ##
    gateway:

      ## the hostname that the gateway will listen on
      ##  - subdomains of this hostname may also be used, depending on which apps are enabled
      ##
      hostname: kubeflow.bys.bys.asia

      ## the ports that the gateway will listen on
      ##  - these are the "internal" ports which the gateway use, and can be different
      ##    to the user-facing ports which the service listens on, see `gatewayService.ports`
      ##
      ports:
        http: 80
        https: 443

      tls:
        ## ALB does NOT forward the SNI after TLS termination, 
        ## so we must disable SNI matching in the gateway
        matchSNI: false
        redirect: false

      ## the pod labels used by the gateway to find the ingress gateway deployment
      ##
      selectorLabels:
        app: deploykf-gateway
        istio: deploykf-gateway

    ## istio gateway deployment configs
    ##
    gatewayDeployment:
      serviceAccount:
        name: deploykf-gateway
        annotations: {}

    ## istio gateway service configs
    ##
    gatewayService:
      name: deploykf-gateway
      annotations: {}
      type: ClusterIP


  ## --------------------------------------
  ##      deploykf-profiles-generator
  ## --------------------------------------
  deploykf_profiles_generator:

    ## profile defaults
    ##
    profileDefaults:

      ## a common prefix to add to all profile names
      ##
      profileNamePrefix: ""

      ## the default access for members of profiles, when not explicitly specified
      ##  - `role`: the Kubernetes RBAC role to bind to the user in the profile namespace
      ##      - "edit": binds "ClusterRole/kubeflow-edit" (can view/create/delete resources)
      ##      - "view": binds "ClusterRole/kubeflow-view" (cam view resources)
      ##  - `notebooksAccess`: if the user can ~connect~ to kubeflow notebooks in the profile
      ##                       note, the ability to create/delete notebook resources is controlled by `role`
      ##
      memberAccess:
        role: view
        notebooksAccess: false

      ## the default list of plugins for profiles, when not explicitly specified
      ##  - each entry is a map with the following keys:
      ##     - `kind`: the kind of plugin
      ##         - "AwsIamForServiceAccount": manages AWS IRSA for the profile namespace
      ##         - "WorkloadIdentity": manages GCP WorkloadIdentity for the profile namespace
      ##     - `spec`: a map of plugin-specific configurations
      ##         - spec for AwsIamForServiceAccount:
      ##           https://github.com/kubeflow/kubeflow/blob/v1.7.0/components/profile-controller/controllers/plugin_iam.go#L30
      ##         - spec for WorkloadIdentity:
      ##           https://github.com/kubeflow/kubeflow/blob/v1.7.0/components/profile-controller/controllers/plugin_workload_identity.go#L39
      ##  - override these defaults for a specific profile by setting `plugins` in that profile,
      ##    to disable all plugins for that profile, set an empty list `[]`
      ##
      plugins: []

      ## the default resource quota for profiles, when not explicitly specified
      ##  - spec for ResourceQuotaSpec:
      ##    https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#resourcequotaspec-v1-core
      ##  - override these defaults for a specific profile by setting `resourceQuotaSpec` in that profile,
      ##    to disable resource quotas for that profile, set an empty map `{}`
      ##
      resourceQuotaSpec: {}

      ## the default tool configs for profiles
      ##
      tools:

        ## the default Kubeflow Pipelines configs for profiles
        ##
        kubeflowPipelines:

          ## the default Kubeflow Pipelines object store auth configs for profiles
          ##  - the behaviour of these configs depends on `kubeflow_tools.kubeflow_pipelines.objectStore`
          ##  - override these defaults for a specific profile by setting `tools.kubeflowPipelines.objectStoreAuth` in that profile
          ##
          objectStoreAuth:
            existingSecret: "kubeflow-pipelines--profile-object-store-auth--{profile_name}"
            existingSecretNamespace: ""
            existingSecretAccessKeyKey: "access_key"
            existingSecretSecretKeyKey: "secret_key"

    ## user entities
    ##  - a list of users that can be referenced when defining profile members
    ##  - each element is a map with the following keys:
    ##     - `id`: a unique identifier for the user
    ##     - `email`: the email of the user (must exactly match the email from your identity provider)
    ##
    users:
     - id: bys
       email: "bys@amazon.com"
     - id: test
       email: "bys+test@amazon.com"

    ## group entities
    ##  - a list of groups that can be referenced when defining profile members
    ##  - each element is a map with the following keys:
    ##     - `id`: a unique identifier for the group
    ##     - `users`: a list of user IDs that are members of the group
    ##
    groups:
     - id: admin-group
       users:
         - bys
     - id: test-group
       users:
         - bys
         - test

    ## profile definitions
    ##  - a list of profile definitions to be generated
    ##  - each element is a map with the following keys:
    ##     - `name`: the name of the profile (must be unique)
    ##               note, the name of a profile is also its namespace name
    ##     - `members`: a list of members and their access to this profile (default: [])
    ##                  note, if a user appears in multiple memberships, the most permissive access is used
    ##          each element is a map with the following keys:
    ##           - `user`: the ID of a user (mutually exclusive with `group`)
    ##           - `group`: the ID of a group (mutually exclusive with `user`)
    ##           - `access`: a map with configs for this member's access to the profile:
    ##               - `role`: the Kubernetes RBAC role to bind to the user (default: `profileDefaults.memberAccess.role`)
    ##               - `notebooksAccess`: if the user can ~connect~ to kubeflow notebooks (default: `profileDefaults.memberAccess.notebooksAccess`)
    ##     - `plugins`: the list of plugins for this profile (default: `profileDefaults.plugins`)
    ##     - `resourceQuotaSpec`: the resource quota for this profile (default: `profileDefaults.resourceQuotaSpec`)
    ##     - `tools`: a map with configs for tools:
    ##         - `kubeflowPipelines`: a map with configs for Kubeflow Pipelines:
    ##             - `objectStoreAuth`: a map with configs for object store auth (default: `profileDefaults.tools.kubeflowPipelines.objectStoreAuth`)
    ##                 - `existingSecret`: the name of an existing kubernetes secret
    ##                 - `existingSecretNamespace`: the namespace containing the kubernetes secret (default: namespace of this profile)
    ##                 - `existingSecretAccessKeyKey`: the key within the secret that contains the access-key (default: "access_key")
    ##                 - `existingSecretSecretKeyKey`: the key within the secret that contains the secret-key (default: "secret_key")
    ##
    profiles:
     - name: admin-profile
       members:
         - group: admin-group
           access:
             role: edit
             notebooksAccess: true
     - name: test-profile
       members:
         - group: test-group
           access:
             role: view
             notebooksAccess: true
    #
    #  - name: team-1-prod
    #    members:
    #      - group: team-1
    #        access:
    #          role: view
    #          notebooksAccess: false


## --------------------------------------------------------------------------------
##
##                                   deploykf-opt
##
## --------------------------------------------------------------------------------
deploykf_opt:

  ## --------------------------------------
  ##            deploykf-minio
  ## --------------------------------------
  deploykf_minio:
    enabled: true
    namespace: deploykf-minio

    ## root user configs for minio
    ##
    rootUser:
      existingSecret: "generated--deploykf-minio-root-user"
      existingSecretUsernameKey: "username"
      existingSecretPasswordKey: "password"
      generateSecret: true

      ## service accounts for the root user
      ##  - these service accounts are created and/or updated by a post-install job
      ##  - each element in the list is a map with the following fields:
      ##     - `accessKey`: the access-key for the service account
      ##     - `secretKey`: the secret-key for the service account
      ##     - `existingSecret`: the name of an existing secret containing the access & secret keys
      ##     - `existingSecretAccessKeyKey`: the key in the secret containing the access-key (default: "access_key")
      ##     - `existingSecretSecretKeyKey`: the key in the secret containing the secret-key (default: "secret_key")
      ##     - `generateSecret`: if true, random keys are generated and stored in `existingSecret` (default: false)
      ##                         note, `existingSecret` must be set to a unique value for each service account
      ##                         note, the job will fail if the secret already exists in the cluster
      ##     - `policy`: the minio policy document as a YAML map, ~not a string~ (default: empty/root-access)
      ##                 https://min.io/docs/minio/container/administration/identity-access-management/policy-based-access-control.html
      ##  - [WARNING] if a `policy` is not specified, the service account will have root access
      ##  - [WARNING] unlisted minio service accounts will be removed from minio
      ##  - [WARNING] unlisted kubernetes secrets with this label will be removed from the cluster:
      ##              "deploykf-minio.deploykf.org/generated-minio-root-service-account: true"
      ##
      serviceAccounts: []

    ## identity configs
    ##
    identity:

      ## OpenID Connect configs (connects to `deploykf-auth` dex)
      ##
      openid:
        ## sets `MINIO_IDENTITY_OPENID_CLAIM_NAME`
        ##  - if set to "email", access `policies` are automatically generated for each user
        ##    based on their `access.role` in each profile
        ##
        policyClaim: "email"

    ## minio buckets
    ##  - these buckets are created and/or updated by a post-install job
    ##  - each element is a map with the following keys:
    ##     - `name`: the name of the bucket
    ##     - `versioning`: the name of the policy to apply to the bucket
    ##  - if Kubeflow Pipelines is enabled, a bucket named `kubeflow_tools.pipelines.bucket.name`
    ##    is automatically added to this list, with `versioning` disabled
    ##
    buckets: []

    ## minio access policies
    ##  - [WARNING] existing policies that have "@" in their name will be removed from minio if they are not listed here
    ##  - these policies are created and/or updated by a post-install job
    ##  - each element is a map with the following keys:
    ##     - `name`: the name of the policy
    ##     - `policy`: the minio policy document as a YAML map, ~not a string~
    ##                 https://min.io/docs/minio/container/administration/identity-access-management/policy-based-access-control.html
    ##  - if Kubeflow Pipelines is enabled, and `identity.openid.policyClaim` is set to "email",
    ##    policies are automatically generated for each user based on their `access.role` in each profile
    ##
    policies: []

  ## --------------------------------------
  ##            deploykf-mysql
  ## --------------------------------------
  deploykf_mysql:
    enabled: true
    namespace: deploykf-mysql

    ## configs for the "root@localhost" mysql user
    ##  - these credentials are used by the liveness probes
    ##  - as this is "root@localhost", these credentials can only be used from within the pod
    ##
    rootUser:
      existingSecret: "generated--deploykf-mysql-root-user"
      existingSecretPasswordKey: "password"
      generateSecret: true

    ## configs for the kubeflow mysql user
    ##  - if a Kubeflow app requires MySQL and is not configured to use an external database,
    ##    it will use these credentials
    ##
    kubeflowUser:
      existingSecret: "generated--deploykf-mysql-kubeflow-user"
      existingSecretUsernameKey: "username"
      existingSecretPasswordKey: "password"
      generateSecret: true


## --------------------------------------------------------------------------------
##
##                              kubeflow-dependencies
##
## --------------------------------------------------------------------------------
kubeflow_dependencies:

  ## --------------------------------------
  ##        kubeflow-argo-workflows
  ## --------------------------------------
  kubeflow_argo_workflows:
    enabled: true
    namespace: kubeflow-argo-workflows


## --------------------------------------------------------------------------------
##
##                                  kubeflow-tools
##
## --------------------------------------------------------------------------------
kubeflow_tools:

  ## --------------------------------------
  ##                 katib
  ## --------------------------------------
  katib:
    enabled: true

    ## mysql connection configs
    ##  - if `useExternal` is true, katib will use the specified external mysql database
    ##  - if `useExternal` is false, katib will use the embedded `deploykf_opt.deploykf_mysql` database,
    ##    and all other configs will be ignored
    ##
    mysql:
      useExternal: false
      host: "mysql.example.com"
      port: 3306
      auth:
        username: kubeflow
        password: password
        existingSecret: ""
        existingSecretUsernameKey: "username"
        existingSecretPasswordKey: "password"

    ## mysql database name
    ##
    mysqlDatabase: katib


  ## --------------------------------------
  ##               notebooks
  ## --------------------------------------
  notebooks:
    enabled: true

    ## notebook spawner configs
    ##  - these configs directly become the `spawner_ui_config.yaml` in the jupyter-web-app
    ##
    spawnerFormDefaults: {}


  ## --------------------------------------
  ##               pipelines
  ## --------------------------------------
  pipelines:
    enabled: true

    ## storage bucket configs
    ##
    bucket:
      name: kubeflow-pipelines
      region: ""

    ## object store configs
    ##  - if `useExternal` is true, pipelines will use the specified external object store
    ##  - if `useExternal` is false, pipelines will use the embedded `deploykf_opt.deploykf_minio` object store,
    ##    and all other configs will be ignored
    ##
    objectStore:
      useExternal: false
      host: s3.amazonaws.com
      port: ""
      useSSL: true

      ## object store auth configs
      ##  - https://www.deploykf.org/guides/external/object-store/#connect-an-external-object-store
      ##
      auth:
        fromEnv: false
        accessKey: my-access-key
        secretKey: my-secret-key
        existingSecret: ""
        existingSecretAccessKeyKey: "AWS_ACCESS_KEY_ID"
        existingSecretSecretKeyKey: "AWS_SECRET_ACCESS_KEY"

    ## mysql connection configs
    ##  - if `useExternal` is true, pipelines will use the specified external mysql database
    ##  - if `useExternal` is false, pipelines will use the embedded `deploykf_opt.deploykf_mysql` database,
    ##    and all other configs will be ignored
    ##
    mysql:
      useExternal: false
      host: "mysql.example.com"
      port: 3306
      auth:
        username: kubeflow
        password: password
        existingSecret: ""
        existingSecretUsernameKey: "username"
        existingSecretPasswordKey: "password"

    ## mysql database names
    ##
    mysqlDatabases:
      cacheDatabase: kfp_cache
      metadataDatabase: kfp_metadata
      pipelinesDatabase: kfp_pipelines

    ## profile resource generation configs
    ##
    profileResourceGeneration:

      ## if a PodDefault named "kubeflow-pipelines-api-token" should be generated in each profile namespace
      ##  - the generated PodDefault will mount a serviceAccountToken volume which can be used to authenticate
      ##    with the Kubeflow Pipelines API on Pods which have a `kubeflow-pipelines-api-token` label with value "true"
      ##  - for more information, see the "Full Kubeflow (from inside cluster)" section of the following page:
      ##    https://www.kubeflow.org/docs/components/pipelines/v1/sdk/connect-api/
      ##  - the PodDefault will NOT be generated if `kubeflow_tools.poddefaults_webhook.enabled` is false,
      ##    regardless of this setting
      ##
      kfpApiTokenPodDefault: false


  ## --------------------------------------
  ##          poddefaults-webhook
  ## --------------------------------------
  poddefaults_webhook:
    enabled: true


  ## --------------------------------------
  ##             tensorboards
  ## --------------------------------------
  tensorboards:
    enabled: true


  ## --------------------------------------
  ##           training-operator
  ## --------------------------------------
  training_operator:
    enabled: false


  ## --------------------------------------
  ##                volumes
  ## --------------------------------------
  volumes:
    enabled: true

Deploy Kubeflow

kubectl apply -f app-of-apps.yaml

Sync Application

# download the latest version of the script
curl -fL -o "sync_argocd_apps.sh" \
  "https://raw.githubusercontent.com/deployKF/deployKF/main/scripts/sync_argocd_apps.sh"

# ensure the script is executable
chmod +x ./sync_argocd_apps.sh

# ensure your kubectl context is set correctly
kubectl config current-context

# run the script
bash ./sync_argocd_apps.sh --grpc-web

Domain


Keycloak Authorization 동작

설정이 완료되고 minio-console 도메인에 접속한 후, keycloak 인증을 통해 접속을 하면 어떤 정책을 부여 받을지는 MINIO_IDENTITY_OPENID_CLAIM_NAME 환경변수에 적용되어 있다. 즉, Keycloak에서 제공한 사용자의 이메일 주소를 기반으로 MinIO에서 해당 사용자를 식별하여 정책을 부여하겠다는 의미이다.

MINIO_IDENTITY_OPENID_CLAIM_NAME=email
  1. Keycloak 유저의 E-mail을 확인
  2. deploykf-profiles-generator 에 맵핑된 E-mail 프로파일의 access.role 에 따라 정책을 부여한다.
  3. bys 예시를 확인하면 admin-profile 에 대해서는 access.role: edit 을 가지고 있으므로, Put, Delete가 가능하지만, test-profile 에 대해서는 access.role: view를 가지고 있으므로, Get 만 가능하다.
    설정된 MinIO Policy
    {
     "Version": "2012-10-17",
     "Statement": [
         {
             "Effect": "Allow",
             "Action": [
                 "s3:ListBucket",
                 "s3:GetBucketLocation"
             ],
             "Resource": [
                 "arn:aws:s3:::kubeflow-pipelines"
             ]
         },
         {
             "Effect": "Allow",
             "Action": [
                 "s3:DeleteObject",
                 "s3:GetObject",
                 "s3:PutObject"
             ],
             "Resource": [
                 "arn:aws:s3:::kubeflow-pipelines/artifacts/admin-profile/*",
                 "arn:aws:s3:::kubeflow-pipelines/v2/artifacts/admin-profile/*"
             ]
         },
         {
             "Effect": "Allow",
             "Action": [
                 "s3:GetObject"
             ],
             "Resource": [
                 "arn:aws:s3:::kubeflow-pipelines/artifacts/test-profile/*",
                 "arn:aws:s3:::kubeflow-pipelines/v2/artifacts/test-profile/*"
             ]
         }
     ]
    }
    

KFP 파이프라인을 수행할 때 Keycloak 으로 부터 토큰을 받아와서 kfp 인증을 수행해야 할 때가 있는데 keycloak_token.py 와 같이 코드에서 접근하려면 Keycloak 클라이언트에서 Direct access grants 설정이 활성화 되어야 한다.

from kfp import Client
from keycloak_token import get_keycloak_token
        
token = get_keycloak_token()

client = Client(
                host='https://kubeflow.bys.asia',
                existing_token=token
            )

keycloak_token.py

from keycloak import KeycloakOpenID

def get_keycloak_token():
    """Keycloak에서 액세스 토큰을 가져오는 함수"""
    keycloak_url = "https://keycloak.bys.asia"
    realm_name = "mlops"
    client_id = "bys-keycloak-client"
    client_secret = ""
    
    # Configure client
    keycloak_openid = KeycloakOpenID(
        keycloak_url=keycloak_url,
        realm_name=realm_name,
        client_id=client_id,
        client_secret_key=client_secret
    )
    
    try:
        # Get Token
        keycloakUser="bys"
        keycloakPassword=""
        token = keycloak_openid.token(keycloakUser, keycloakPassword)
        access_token = token['access_token']
        return access_token
    except Exception as e:
        print(f"Error getting Keycloak token: {e}")
        return None

Kubeflow Component

Notebooks

TensorBoards

Pipeline

Trainer

Argo Workflow

MinIO