
Splunk Operator: There is no way to provide custom default settings to search heads and deployer. #1048

Open

yaroslav-nakonechnikov opened this issue Jan 26, 2023 · 6 comments

@yaroslav-nakonechnikov

Please select the type of request

Bug

Tell us more

Describe the request
At the moment, the definition that creates search heads is a single CRD, which creates 4 pods at minimum:

  • 1 deployer
  • 3 search head nodes

This is fine by itself, but the problem comes when custom apps need to be installed or custom configuration provided. A real example of what must be customizable: authentication.conf.

We are using a PingID integration, which requires the fqdn setting to be defined: the URL the browser redirects to after a successful login. By default the hostname is used, which is not reachable from client PCs.

And if we provide a defaults.yml with a correct PingID configuration including fqdn, the problem arises: only one domain name can be defined.

Expected behavior
There should be a way to define one default config for the search head deployer and a different config for the search head nodes, as sketched below.
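
For illustration, a sketch of what this could look like on the SearchHeadCluster resource. The defaults field exists in the operator today; deployerDefaults is hypothetical and only shows the desired shape (apiVersion may differ between operator versions):

apiVersion: enterprise.splunk.com/v4
kind: SearchHeadCluster
metadata:
  name: shc
spec:
  replicas: 3
  # existing field: inline defaults.yml applied to every pod of this CR
  defaults: |-
    splunk:
      conf:
        - key: authentication
          value:
            directory: /opt/splunk/etc/system/local
            content:
              saml:
                fqdn : https://shc.26981.cmp-prj-dev.internal.cmpgroup.cloud
  # hypothetical field: merged on top of spec.defaults,
  # applied only to the deployer pod
  deployerDefaults: |-
    splunk:
      conf:
        - key: authentication
          value:
            directory: /opt/splunk/etc/system/local
            content:
              saml:
                fqdn : https://shc-deployer.26981.cmp-prj-dev.internal.cmpgroup.cloud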

Splunk setup on K8S
EKS on AWS

Reproduction/Testing steps
With the following defaults.yml submitted to the search heads CRD, only one fqdn can be used.

splunk:
  app_paths_install:
    default:
      - https://proxy/artifactory/prj-raw-host/splunk/splunk-apps/config-explorer_1715.tgz
      - https://proxy/artifactory/prj-raw-host/splunk/splunk-apps/splunk-datasets-add-on_10.tgz
      - https://proxy/artifactory/prj-raw-host/splunk/splunk-apps/splunk-app-for-lookup-file-editing_360.tgz
  conf:
    - key: deploymentclient
      value:
        directory: /opt/splunk/etc/system/local
        content:
          deployment-client :
            disabled : false
          target-broker:deploymentServer :
            targetUri : deployment-server.prj-eks-dev.cmp-prj-dev.internal.cmpgroup.cloud:8089
    - key: authentication
      value:
        directory: /opt/splunk/etc/system/local
        content:
          authentication:
            authSettings : saml
            authType : SAML
          saml:
            entityId : splunkACSEntityId
            fqdn : https://shc-deployer.26981.cmp-prj-dev.internal.cmpgroup.cloud
            idpSSOUrl : https://idp.host.com/idp/SSO.saml2
            inboundDigestMethod : SHA1;SHA256;SHA384;SHA512
            inboundSignatureAlgorithm : RSA-SHA1;RSA-SHA256;RSA-SHA384;RSA-SHA512
            issuerId : idp:host.com:saml2
            lockRoleToFullDN : true
            redirectAfterLogoutToUrl : https://www.splunk.com
            redirectPort : 443
            replicateCertificates : true
            signAuthnRequest : true
            signatureAlgorithm : RSA-SHA1
            signedAssertion : true
            sloBinding : HTTP-POST
            ssoBinding : HTTP-POST
            clientCert : /mnt/certs/saml_sig.pem
            idpCertPath: /mnt/certs/
          roleMap_SAML:
            admin : cmp-aws-s-eng-admin;aws-s-eng-admin

So the problem is that we need a way to provide:
fqdn : https://shc-deployer.26981.cmp-prj-dev.internal.cmpgroup.cloud - for the deployer
fqdn : https://shc.26981.cmp-prj-dev.internal.cmpgroup.cloud - for the search heads

@yaroslav-nakonechnikov
Author

According to the documentation (https://splunk.github.io/docker-splunk/ADVANCED.html), there is a nice example:
password: "{{ splunk_password | default(<password>) }}"
so I thought some Jinja functions should work...

So I tried something like:
fqdn : "{% if getenv("SPLUNK_ROLE") == "splunk_search_head" %}https://shc.${splunk_domain}{% else %}https://shc-deployer.${splunk_domain}{% endif %}"
and it is not working, because of:

yaml.scanner.ScannerError: while scanning for the next token
found character '%' that cannot start any token
  in "<unicode string>", line 27, column 21:
    fqdn : {% if getenv("SPLUNK_ROLE") == "s ...
                        ^
[WARNING]:  * Failed to parse /opt/ansible/inventory/environ.py with ini plugin: /opt/ansible/inventory/environ.py:16: Expected key=value host variable assignment, got: __future__
[WARNING]: Unable to parse /opt/ansible/inventory/environ.py as an inventory source
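
Two things go wrong here: the YAML scanner sees the value as starting with %, which cannot begin a plain scalar (the nested double quotes terminate the string early instead of protecting it), and getenv() is not a Jinja builtin in this context. A sketch of a variant that at least parses, using single quotes for the outer string and the Ansible env lookup (whether the template is actually rendered at this point is a separate question, answered by the workaround below):

fqdn : '{% if lookup("ansible.builtin.env", "SPLUNK_ROLE") == "splunk_search_head" %}https://shc.${splunk_domain}{% else %}https://shc-deployer.${splunk_domain}{% endif %}'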

@yaroslav-nakonechnikov
Author

I found a workaround:

fqdn : >
              {% if lookup('ansible.builtin.env', 'SPLUNK_ROLE') == "splunk_search_head" %}
                https://shc.${splunk_domain}
              {% else %}
                https://shc-deployer.${splunk_domain}
              {% endif %}

But I would like to know if there is a better, official way to do that.
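
A possible refinement of the same workaround, untested here: the folded scalar above leaves indentation and line breaks in the rendered value, so Jinja whitespace control ({%- ... -%}) together with >- (which also drops the trailing newline) should yield a clean URL:

fqdn : >-
  {%- if lookup('ansible.builtin.env', 'SPLUNK_ROLE') == "splunk_search_head" -%}
    https://shc.${splunk_domain}
  {%- else -%}
    https://shc-deployer.${splunk_domain}
  {%- endif -%}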

@akondur
Collaborator

akondur commented Jan 30, 2023

CSPL-2152

@yaroslav-nakonechnikov
Author

This one is becoming critical.

A real case:
when there is a long list of apps to install, the deployer needs a lot of time before its pod reaches the Running state.
Defining a StartupProbe with a big timeout also affects the search head nodes, which means the SH nodes can't get an IP assigned until the startup probe starts passing.

If the threshold is increased instead, it leads to another issue: real failures won't be detected fast enough.
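
For reference, newer operator releases expose probe overrides on the CR spec, but they apply to every pod the CR creates, which is exactly the limitation described above. A sketch, assuming such a release (field names may differ between operator versions):

spec:
  startupProbe:
    initialDelaySeconds: 40
    periodSeconds: 30
    failureThreshold: 12   # generous enough for the deployer's app installs,
                           # but the search head nodes inherit the same slack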

@yaroslav-nakonechnikov
Author

Related: splunk/splunk-ansible#784

@yaroslav-nakonechnikov
Author

And another thing found:

when the deployer starts, it connects to the deployment server and downloads apps. Meanwhile the nodes move further along, and then the deployer gets stuck on:

TASK [splunk_deployer : Wait for SHC to be ready] ******************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: Exception: SHC failure, setup not complete. online_peers:['05BB34E3-9B8F-4916-A60A-493D4534047F', 'B7CE74CB-138D-4E4F-9A6C-B4DB791C155D']
fatal: [localhost]: FAILED! => {
    "attempts": 60,
    "changed": false,
    "rc": 1
}
 
MSG:
 
MODULE FAILURE
See stdout/stderr for the exact error
 
 
MODULE_STDERR:
 
Traceback (most recent call last):
  File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 100, in <module>
    _ansiballz_main()
  File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 92, in _ansiballz_main
    invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)
  File "/home/splunk/.ansible/tmp/ansible-tmp-1709656969.8714278-4953-235691734405253/AnsiballZ_shc_ready.py", line 41, in invoke_module
    run_name='__main__', alter_sys=True)
  File "/usr/lib/python3.7/runpy.py", line 205, in run_module
    return _run_module_code(code, init_globals, run_name, mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 55, in <module>
  File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 50, in main
  File "/tmp/ansible_shc_ready_payload_nh5z9sh5/ansible_shc_ready_payload.zip/ansible/modules/shc_ready.py", line 37, in run
Exception: SHC failure, setup not complete. online_peers:['05BB34E3-9B8F-4916-A60A-493D4534047F', 'B7CE74CB-138D-4E4F-9A6C-B4DB791C155D']
 
 
PLAY RECAP *********************************************************************
localhost                  : ok=137  changed=20   unreachable=0    failed=1    skipped=64   rescued=0    ignored=0

Executing splunk resync shcluster-replicated-config manually on the deployer allows this check to pass.
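
For anyone hitting the same wall, the manual step can be run from outside the pod. A sketch, assuming the operator's default pod naming (replace the pod name with yours; the CLI may prompt for admin credentials):

kubectl exec -it splunk-shc-deployer-0 -- \
  /opt/splunk/bin/splunk resync shcluster-replicated-config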
