| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Return code 200 of a POST method request must be treated as success.
Newly required due to the SFTP API change using POST.
Related to: #2764
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change API version from v1 to v2, which includes:
- change of URL
- different URI
- POST method for token generation instead of GET
Resolves: #2764
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
When case_id is not supplied, we ask SFTP server to store the uploaded
file under name /var/tmp/<tarball>, which is confusing.
Let's remove the path from the name also when case_id is not supplied.
Related to: #2764
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
| |
SoSCollector no longer declares the get_upload_url method,
as that was moved under the Policy class(es).
Resolves: #2766
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
| |
It was discovered that our extra handling for shortnames was
unintentionally case sensitive. Fix this to ensure that shortnames are
obfuscated regardless of case in all collected text.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
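A case-insensitive replacement of this kind could look roughly like the sketch below. The function name and the replacement string are hypothetical and are not taken from the sos code base.

```python
import re


def obfuscate_shortname(text, shortname, replacement):
    """Replace whole-word occurrences of shortname, ignoring case.

    Hypothetical helper for illustration only.
    """
    pattern = re.compile(r"\b%s\b" % re.escape(shortname), re.IGNORECASE)
    return pattern.sub(replacement, text)
```

The word boundaries keep a short name such as `node1` from also matching inside `node10`.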
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up until now, our sourcing of hostnames/domains for obfuscation has been
dependent upon the output of the `hostname` command. However, some
scenarios have come up where sourcing `/etc/hosts` is advantageous for
several reasons:
First, if `hostname` output is unavailable, this provides a fallback
measure.
Second, `/etc/hosts` is a common place to have short names defined which
would otherwise not be detected (or at the very least would result in a
race condition based on where/if the short name was elsewhere able to be
gleaned from an FQDN), thus leaving the potential for unobfuscated data
in an archive.
Due to both the nature of hostname obfuscation and the malleable syntax
of `/etc/hosts`, the parsing of this file needs special handling not
covered by our more generic parsing and obfuscation methods.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
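As a rough illustration of why `/etc/hosts` needs its own parsing, the sketch below splits each entry into FQDNs and short names. It is a minimal example under assumed rules (comments start with `#`, the first field is the address, names starting with `localhost` are skipped) and is not the parser sos uses.

```python
def parse_hosts(content):
    """Return (fqdns, shortnames) found in /etc/hosts-style content.

    Illustrative only; the skip rule for localhost names is an assumption.
    """
    fqdns, shortnames = set(), set()
    for line in content.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) < 2:
            continue  # blank line, or an address with no names
        for name in fields[1:]:  # fields[0] is the IP address
            if name.startswith("localhost"):
                continue
            if "." in name:
                fqdns.add(name.lower())
            else:
                shortnames.add(name.lower())
    return fqdns, shortnames
```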
|
|
|
|
|
|
|
|
|
|
| |
This patch is to update nvidia plugin to collect
logs for Nvidia GPUs
Signed-off-by: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
Reported-by: Borislav Stoymirski <borislav.stoymirski@bg.ibm.com>
Reported-by: Yesenia Jimenez <yesenia@us.ibm.com>
|
|
|
|
| |
Signed-off-by: Michael Cambria <mcambria@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sos report on OCP with hundreds of namespaces times out in the networking
plugin, as it collects >10 commands for each namespace.
Let's use a balanced approach in:
- increasing network.timeout
- limiting namespaces to traverse
- disabling ethtool per namespace
to ensure sos report successfully finishes in a reasonable time,
collecting a reasonable amount of data.
Resolves: #2754
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enhance --estimate-mode to also calculate sizes of:
- symlinks
- directories themselves
- manifest.json file
Use the os.lstat() method instead of os.stat() to properly calculate the
sizes (e.g. of symlinks themselves and not of their destinations).
Print the five biggest plugins instead of three, as sos logs and reports
often stand as one "plugin" in the list.
Resolves: #2752
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
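The difference between `os.lstat()` and `os.stat()` can be seen in a small size estimator like the one below. This is a sketch of the general technique, with a hypothetical function name, not the code behind `--estimate-mode`.

```python
import os


def estimate_tree_size(root):
    """Sum sizes under root without following symlinks.

    Directories and symlinks are counted by their own lstat() size.
    """
    total = os.lstat(root).st_size  # the top directory itself
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk() does not descend into symlinked directories by default
        for name in dirnames + filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total
```

With `os.stat()` a symlink would be counted with the size of its target, so a link to a large file would be counted twice.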
|
|
|
|
|
|
|
|
|
|
|
| |
Catch unhandled EOFError in collector and cleaner.
Update the behaviour in report that redundantly prints
the error message twice.
Resolves: #2751
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the `TMPDIR` env var is set, we should reference it if the user has
not provided `--tmp-dir` by the cmdline or sos.conf.
The order of precedence is now:
1. cmdline use of `--tmp-dir`
2. setting `tmp-dir` in `/etc/sos/sos.conf`
3. the `TMPDIR` environment variable
4. `/var/tmp` as a default
Additionally, we will now check if the filesystem type for our tmpdir is
tmpfs, and if so print a warning to the user about the potential
pitfalls of doing so. This information is now recorded in the manifest
as well.
Closes: #2738
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
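The precedence order listed above can be expressed as a short lookup. The function and parameter names here are hypothetical; only the order and the `/var/tmp` default come from the commit message.

```python
import os


def resolve_tmpdir(cmdline_value=None, config_value=None, environ=None):
    """Pick the tmp dir: cmdline, then sos.conf, then TMPDIR, then /var/tmp."""
    environ = os.environ if environ is None else environ
    for candidate in (cmdline_value, config_value, environ.get("TMPDIR")):
        if candidate:
            return candidate
    return "/var/tmp"
```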
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, -k networking.namespace_pattern=.. is broken as the regex test
forgets to add the namespace in case of a positive match.
Also ensure both plugopts namespace_pattern and namespaces work
together.
Resolves: #2748
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
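One way the two options could cooperate is sketched below. It assumes that the pattern is a whitespace-separated list of `*` wildcards and that the second option is a maximum count with 0 meaning unlimited; both semantics, and the function name, are assumptions for illustration.

```python
import re


def select_namespaces(namespaces, pattern=None, max_count=0):
    """Filter namespaces by wildcard pattern(s), then cap the result."""
    regex = None
    if pattern:
        # translate shell-style '*' wildcards into one anchored regex
        parts = [re.escape(p).replace(r"\*", ".*") for p in pattern.split()]
        regex = re.compile("(?:%s)$" % "|".join(parts))
    selected = []
    for ns in namespaces:
        if regex is not None and not regex.match(ns):
            continue
        selected.append(ns)  # a positive match must actually be kept
        if max_count and len(selected) >= max_count:
            break
    return selected
```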
|
|
|
|
|
|
|
|
|
| |
Enhance the ceph_osd plugin to collect more data specific to OSD nodes.
Related: #1945
Resolves: #2735
Signed-off-by: Nikhil Kshirsagar <nkshirsagar@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was discovered that setting a specific plugin timeout via the `-k
$plugin.timeout` option could influence the timeout setting for other
plugins that are not also having their timeout explicitly set. Fix this
by moving the default plugin opts into `Plugin.__init__()` so that each
plugin is ensured a private copy of these default plugin options.
Additionally, add more timeout data to plugin manifest entries to allow
for better tracking of this setting.
Adds a test case for this scenario.
Closes: #2744
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
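The underlying problem is a general Python pitfall: a mutable default defined at class level is shared by every instance. The sketch below uses hypothetical class and option names to contrast the shared and the per-instance variants.

```python
def default_opts():
    """Return a fresh dict of default options (names are illustrative)."""
    return {"timeout": -1}


class SharedOptsPlugin:
    # Evaluated once at class creation: all instances alias this dict
    opts = default_opts()


class PrivateOptsPlugin:
    def __init__(self):
        # Evaluated per instance: each plugin gets a private copy
        self.opts = default_opts()
```

Changing `opts["timeout"]` on one `SharedOptsPlugin` instance is visible on every other instance, while `PrivateOptsPlugin` instances stay independent.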
|
|
|
|
|
|
|
|
|
|
| |
Wait for the threads of timed-out plugins to shut down, to prevent
them from writing to moved auxiliary files like sos_logs/sos.log
Resolves: #2722
Closes: #2746
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
| |
Resolves: #2743
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
`get_container_logs()` is now `add_container_logs()` to align it better
with our more common `add_*` methods for plugin collections.
Additionally, it has been extended to accept either a single string or a
list of strings like the other methods, and plugin authors may now
specify either specific container names or regexes.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Fixes a typo in setting the non-primary node options from the ocp
profile against the sosnode object. Second, fixes a small break in
checksum handling for the manifest discovered during `oc` transport
testing for edge cases.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds explicit setup of a new project to use in the `ocp` cluster and
adds better handling of cluster setup generally, which the `ocp` cluster
is the first to make use of.
Included in this change is a correction to
`Cluster.exec_primary_cmd()`'s use of `get_pty` to now be determined on
if the primary node is the local node or not.
Additionally, based on feedback from the OCP engineering team, by
default restrict node lists to masters.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds a new transport for `sos collect` by leveraging a
locally available `oc` binary that has been properly configured for
access to an OCP cluster.
This transport will allow users to use `sos collect` to collect reports
from an OCP cluster without directly connecting to any of the nodes
involved. We do this by using the `oc` binary to first launch a pod on
target node(s) and then exec our discovery commands and eventual `sos
report` command to that pod. This in turn is dependent on a functional API
for the `oc` binary to communicate with. In the event that `oc` is not
__locally__ available or is not properly configured, we will fall back to
the current default of using SSH ControlPersist to directly connect to
the nodes. Otherwise, the OCP cluster will attempt to automatically use
this new transport.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a new `--transport` option for users to be able to specify the type
of transport to use when connecting to nodes. The default value of
`auto` will defer to the cluster profile to set the transport type,
which will continue to default to use OpenSSH's ControlPersist feature.
Clusters may override the new `set_transport_type()` method to change
the default transport used.
If `--transport` is anything besides `auto`, then the cluster profile
will not be deferred to when choosing a transport for each remote node.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
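The selection rule described above reduces to a small decision. The function below is a hypothetical sketch; the transport names used as values are illustrative and not confirmed by this log.

```python
def choose_transport(cli_value, cluster_default="control_persist"):
    """Return the transport to use for remote nodes.

    'auto' defers to the cluster profile; anything else wins outright.
    """
    if cli_value != "auto":
        return cli_value
    return cluster_default
```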
|
|
|
|
|
|
|
| |
Convert2RHEL will now archive old logs for the sake of simplicity, so
we are including the archive directory in the collection as well.
Signed-off-by: Rodolfo Olivieri <rolivier@redhat.com>
|
|
|
|
|
|
|
|
|
| |
This stopped being collected after the foreman plugin split.
Related: #2730
Resolves: #2731
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The scls_matched property needs to be generated by the whole loop execution.
Therefore we can report that some SCL was found only after the loop.
Related: #2730
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
| |
"scl enable .." injects the SCL sub-path by itself, so we are doing
a redundant step that could even be harmful.
Related: #2730
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
| |
Signed-off-by: Vikas Goel <vikas.goel@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
iptables -vnxL creates nft 'ip filter' table if it does not exist, hence
we must guard iptables execution by presence of the nft table.
An equivalent logic applies to ip6tables.
Resolves: #2724
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If iptables is not really in use, calling iptables -t <table>
would load the corresponding nft table.
Therefore, call iptables -t only for the tables from "nft list ruleset"
output.
Example: nft list ruleset contains
table ip mangle {
..
}
so we can collect iptables -t mangle -nvL .
The same applies to ip6tables as well.
Resolves: #2724
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
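Extracting the table names from `nft list ruleset` output, as in the example above, might be done along these lines. This is a sketch with a hypothetical function name; it only assumes that each table starts with a line of the form `table <family> <name> {`.

```python
import re


def nft_tables(ruleset, family="ip"):
    """Return the table names of one family from `nft list ruleset` output."""
    pattern = re.compile(
        r"^table\s+%s\s+(\S+)\s*\{" % re.escape(family), re.MULTILINE
    )
    return pattern.findall(ruleset)
```

A caller would then run `iptables -t <name> -nvL` only for the names returned for family `ip`, and the `ip6tables` equivalent only for family `ip6`.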
|
|
|
|
|
|
|
|
| |
Also, remove obsolete parameters of the log_skipped_cmd method.
Related: #2724
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check whether the dir is a symlink before calling the rmtree()
method, so that the unlink() method can be used instead.
Traceback (most recent call last):
File "./bin/sos", line 22, in <module>
sos.execute()
File "/tmp/sos/sos/__init__.py", line 186, in execute
self._component.execute()
OSError: Cannot call rmtree on a symbolic link
Closes: #2727
Signed-off-by: Eric Desrochers <eric.desrochers@canonical.com>
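The guard described in this commit can be sketched as follows. The function name is hypothetical; `shutil.rmtree()` refusing to operate on a symlink is what produces the `OSError` in the traceback above.

```python
import os
import shutil


def remove_dir(path):
    """Remove a directory tree, or just the link if path is a symlink."""
    if os.path.islink(path):
        os.unlink(path)  # removes the link only, not its target
    else:
        shutil.rmtree(path)
```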
|
|
|
|
|
|
|
|
| |
Updates plugins to use the new `self.path_join()` wrapper for
`os.path.join()` so that these plugins now account for non-/ sysroots
for their collections.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
| |
Adds a wrapper for `os.path.join()` which accounts for non-/ sysroots,
like we have done previously for other `os.path` methods. Further
updates `Plugin()` to use this wrapper where appropriate.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
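A sysroot-aware join could be written roughly as below. This is a standalone sketch of the idea and not necessarily the signature or behaviour of the wrapper added by the commit.

```python
import os


def path_join(path, *parts, sysroot=os.sep):
    """Join path components, prefixing sysroot if it is not already there."""
    if not path.startswith(sysroot):
        path = os.path.join(sysroot, path.lstrip(os.sep))
    # strip leading separators so a later component cannot reset the join
    return os.path.join(path, *(p.lstrip(os.sep) for p in parts))
```

For example, with `sysroot="/host"` the call `path_join("/etc", "hosts", sysroot="/host")` gives `/host/etc/hosts`.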
|
|
|
|
|
|
|
|
|
|
|
| |
As an interim stopgap measure, increase the timeout for the stagetwo
`logs` test to allow for more time for handling random data generation
and logging, until we're able to define a better/more efficient way to
generate this data within the test suite.
Related: #2700
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
The debug level messages gated by `-v` are very helpful for diagnosing
test failures, but currently not all tests specify the use of verbosity.
Make use of verbosity a default parameter for all test runs to address
this.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Currently, we estimate just plugins' disk space and ignore sos_logs
or sos_reports directories - although they can occupy nontrivial disk
space as well.
Resolves: #2723
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
| |
Closes: #2720
Signed-off-by: Ponnuvel Palaniyappan <pponnuvel@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replicas of ovs-vswitchd and ovsdb-server can be recreated offline
using flow, group, and tlv dumps, and ovs conf.db. This allows for
offline analysis and the use of tools such as ovs-appctl
ofproto/trace and ovs-ofctl for debugging.
This patch ensures this information is available in the sos report.
The db is copied rather than collected using ovsdb-client list dump
for two reasons:
ovsdb-client requires interacting with the ovsdb-server which could
take it 'down' for some time, and impact large, busy clusters.
The list-dump is not in a format that can be used to restore the db
offline. All of the information in the list dump is available and more
by copying the db.
Signed-off-by: Salvatore Daniele <sdaniele@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ovs-vsctl list bridge can return an empty 'protocol' column even when
there are OpenFlow protocols in place by default.
ovs-ofctl --version will return the range of supported ofp and should
also be used to ensure flow information for relevant protocol versions
is collected.
OpenFlow default versions:
https://docs.openvswitch.org/en/latest/faq/openflow/
Signed-off-by: Salvatore Daniele <sdaniele@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`nose` is no longer maintained, and as of python-3.10 is functionally
broken. As such, instead transition to running those tests via avocado,
like we do with our integration test suite.
The tests themselves do not need much modification, however due to the
isolation provided for executing the tests we do need to explicitly set
a new PYTHONPATH env var for those executions. This means we still need
to run the unit tests as a separate step from the stageone tests.
The changes needed are mostly around file paths relative to the pwd
where the tests are executed from originally.
Additionally, remove the sosreport_pexpect unit test as it is no longer
useful in its own right, would need more significant changes to run
properly with avocado, and the integration test suite provides better
coverage for what it was testing.
Closes: #2716
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using sub-functions[1] gathering the devlink port attributes does
provide value when debugging. If there is no devlink port, the output is
empty.
Example:
pci/0000:04:00.0/65535: type eth netdev enp4s0f0 flavour physical port 0 splittable false
pci/0000:04:00.0/32768: type eth netdev en4f0pf0sf42 flavour pcisf controller 0 pfnum 0 sfnum 42 splittable false
function:
hw_addr 00:00:00:00:88:88 state active opstate attached
pci/0000:04:00.0/32769: type eth netdev en4f0pf0sf1 flavour pcisf controller 0 pfnum 0 sfnum 1 splittable false
function:
hw_addr 00:00:00:00:00:00 state active opstate attached
auxiliary/mlx5_core.sf.4/131072: type eth netdev enp4s0f0s42 flavour virtual port 0 splittable false
auxiliary/mlx5_core.sf.5/196608: type eth netdev enp4s0f0s1 flavour virtual port 0 splittable false
[1] https://www.kernel.org/doc/html/latest/networking/devlink/devlink-port.html#subfunction
Signed-off-by: Antoine Tenart <atenart@kernel.org>
|
|
|
|
|
|
|
|
|
|
| |
During a dry run, add_journal method sets pred=None whilst log_skipped_cmd
refers to predicate attributes. In that case, replace None predicate
by a default / empty predicate.
Resolves: #2711
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding a plugin for Google Cloud Compute Engine VMs.
The plugin will collect data about Google services
running on the system (journalctl -u google*), data from
the Metadata Server that's available for every instance
and output of `gcloud auth list` - if available.
Available option:
keep-pii - if set, the plugin won't remove the
project name and project number from the
metadata.json file.
Closes #2699
Signed-off-by: Maciej Strzelczyk <strzelczyk@google.com>
|
|
|
|
|
|
|
|
|
|
| |
Collect 'insights-client --test-connection --net-debug' cmdout
with a limited timeout to prevent the plugin from being stuck for too
long in case of a networking/proxy issue.
Resolves: #2704
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Collect foreman-puma-status and 'pumactl [gc-|]stats', optionally using
SCL (if detected).
Resolves: #2712
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When prompting for a case id, `Policy` was not properly updating the
option value, only assigning the value to `Policy` which meant that
aspects outside of `Policy` could not always properly reference the
(updated) case id.
Fix this by assigning the case id prompt response back to the case_id
option value. `Policy` still retains a local reference to case_id as
existing logic was setting that based on the (assumed-to-be-updated)
option value, which until this commit would have been superfluous.
Closes: #2707
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If iptables has built-in nf_tables kmod, then
'ip netns <foo> iptables-save' command requires the kmod which must
be guarded by predicate.
Analogously for ip6tables.
Resolves: #2703
Signed-off-by: Pavel Moravec <pmoravec@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit resolves a situation in which `sos` is being run in a
container but the `SystemdInit` InitSystem would not properly load
information from the host, thus causing the `Plugin.is_service*()`
methods to erroneously fail or return `False`.
Fix this scenario by pulling the `_container_init()` and related logic
to check for a containerized host sysroot out of the Red Hat specific
policy and into the base `LinuxPolicy` class so that the init system can
be initialized with the correct sysroot, which is now used to chroot the
calls to the relevant `systemctl` commands.
For now, this does impose the use of looking for the `container` env var
(automatically set by docker, podman, and crio regardless of
distribution) and the use of the `HOST` env var to read where the host's
`/` filesystem is mounted within the container. If desired in the
future, this can be changed to allow policy-specific overrides. For now
however, this extends host collection via an sos container for all
distributions currently shipping sos.
Note that this issue only affected the `InitSystem` abstraction for
loading information about local services, and did not affect init system
related commands called by plugins as part of those collections.
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
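The environment-variable check described above can be sketched like this. The function name is hypothetical, and treating any non-empty `container` value as "containerized" is a simplifying assumption.

```python
import os


def container_sysroot(environ=None):
    """Return the host sysroot when running in a container, else '/'."""
    environ = os.environ if environ is None else environ
    if environ.get("container") and environ.get("HOST"):
        return environ["HOST"]  # where the host's / is mounted
    return "/"
```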
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since its addition to sos, collect has assumed the use of a system
installation of SSH in order to connect to the nodes identified for
collection. However, there may be use cases and desires to use other
transport protocols.
As such, provide an abstraction for these protocols in the form of the
new `RemoteTransport` class that `SoSNode` will now leverage. So far an
abstraction for the currently used SSH ControlPersist function is
provided, along with a pseudo abstraction for local execution so that
SoSNode does not directly need to make more "if local then foo" checks
than are absolutely necessary.
Related: #2668
Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removed the diagnostics part as it is no longer maintained and doesn't
work on Openshift.
Adding additional projects to collect.
Removed getting all namespaces as it is not needed for troubleshooting
and project names are sensitive for some customers.
Adding condition to collect the logs from systemd openshift services if
not running as static pods.
Signed-off-by: Vladislav Walek <22072258+vwalek@users.noreply.github.com>
|