[Infra] Provide troubleshooting information on the host details page #191104

Merged
@@ -106,6 +106,7 @@ export const InfraMetadataInfoResponseRT = rt.partial({
const InfraMetadataRequiredRT = rt.type({
id: rt.string,
name: rt.string,
hasSystemIntegration: rt.boolean,
features: rt.array(InfraMetadataFeatureRT),
});

@@ -30,3 +30,5 @@ export const INTEGRATIONS = {

export const DOCKER_METRIC_TYPES: DockerContainerMetrics[] = ['cpu', 'memory', 'network', 'disk'];
export const KUBERNETES_METRIC_TYPES: KubernetesContainerMetrics[] = ['cpu', 'memory'];

export const APM_HOST_TROUBLESHOOTING_LINK = 'https://ela.st/host-troubleshooting';

@@ -17,11 +17,12 @@ import { Section } from '../../components/section';
import { ServicesSectionTitle } from './section_titles';
import { HOST_NAME_FIELD } from '../../../../../common/constants';
import { LinkToApmServices } from '../../links';
-import { APM_HOST_FILTER_FIELD } from '../../constants';
+import { APM_HOST_FILTER_FIELD, APM_HOST_TROUBLESHOOTING_LINK } from '../../constants';
import { LinkToApmService } from '../../links/link_to_apm_service';
import { useKibanaEnvironmentContext } from '../../../../hooks/use_kibana';
import { useRequestObservable } from '../../hooks/use_request_observable';
import { useTabSwitcherContext } from '../../hooks/use_tab_switcher';
import { useMetadataStateContext } from '../../hooks/use_metadata_state';

export const ServicesContent = ({
hostName,
@@ -33,6 +34,7 @@ export const ServicesContent = ({
const { isServerlessEnv } = useKibanaEnvironmentContext();
const { request$ } = useRequestObservable();
const { isActiveTab } = useTabSwitcherContext();
const { metadata, loading: metadataLoading } = useMetadataStateContext();

const linkProps = useLinkProps({
app: 'home',
@@ -92,7 +94,7 @@ export const ServicesContent = ({
defaultMessage: 'An error occurred while fetching services.',
})}
</EuiCallOut>
-) : isPending(status) ? (
+) : isPending(status) || metadataLoading ? (
<EuiLoadingSpinner size="m" />
) : hasServices ? (
<EuiFlexGroup
@@ -111,15 +113,15 @@ export const ServicesContent = ({
</EuiFlexItem>
))}
</EuiFlexGroup>
-) : (
+) : metadata?.hasSystemIntegration ? (
<p>
<FormattedMessage
id="xpack.infra.assetDetails.services.noServicesMsg"
defaultMessage="No services found on this host. Click {apmTutorialLink} to instrument your services with APM."
values={{
apmTutorialLink: (
<EuiLink
-data-test-subj="assetDetailsTooltiAPMTutorialLink"
+data-test-subj="assetDetailsTooltipAPMTutorialLink"
href={isServerlessEnv ? serverlessLinkProps.href : linkProps.href}
>
<FormattedMessage
@@ -129,7 +131,34 @@ export const ServicesContent = ({
</EuiLink>
),
}}
-/>
+/>{' '}
<EuiLink
data-test-subj="assetDetailsAPMTroubleshootingLink"
href={APM_HOST_TROUBLESHOOTING_LINK}
target="_blank"
>
<FormattedMessage
id="xpack.infra.assetDetails.table.services.noServices.troubleshootingLink"
defaultMessage="Troubleshooting"
/>
</EuiLink>
</p>
) : (
<p>
<FormattedMessage
id="xpack.infra.assetDetails.services.noServicesWithApmMessage"
defaultMessage="No services found on this host."
/>{' '}
<EuiLink
data-test-subj="assetDetailsAPMHostTroubleshootingLink"
href={APM_HOST_TROUBLESHOOTING_LINK}
target="_blank"
>
<FormattedMessage
id="xpack.infra.assetDetails.table.services.noServices.troubleshootingLink"
defaultMessage="Troubleshooting"
/>
</EuiLink>
</p>
)}
</Section>
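
The branch order of the ternary chain in this hunk can be summarised as a small pure function. This is only an illustrative sketch: `resolveServicesViewState`, the `ServicesViewInput` shape, and the state names are hypothetical and are not part of the PR, and the error condition is an assumption because the diff does not show how it is detected.

```typescript
// Hypothetical helper mirroring the order of the ternary chain in ServicesContent.
type ServicesViewState =
  | 'error'
  | 'loading'
  | 'services'
  | 'noServicesWithApmTutorial' // APM tutorial link plus troubleshooting link
  | 'noServicesTroubleshootingOnly'; // troubleshooting link only

interface ServicesViewInput {
  hasError: boolean; // assumption: the diff does not show the error check
  servicesPending: boolean; // isPending(status) in the diff
  metadataLoading: boolean; // loading flag from useMetadataStateContext()
  hasServices: boolean;
  hasSystemIntegration?: boolean; // metadata?.hasSystemIntegration, may be undefined
}

function resolveServicesViewState(input: ServicesViewInput): ServicesViewState {
  if (input.hasError) return 'error';
  // Keep the spinner until both the services request and the metadata request settle,
  // so the "no services" message does not change once metadata arrives.
  if (input.servicesPending || input.metadataLoading) return 'loading';
  if (input.hasServices) return 'services';
  return input.hasSystemIntegration
    ? 'noServicesWithApmTutorial'
    : 'noServicesTroubleshootingOnly';
}
```

An undefined `hasSystemIntegration` falls through to the troubleshooting-only message, which matches how `metadata?.hasSystemIntegration` behaves in the JSX when metadata is missing.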
@@ -22,6 +22,7 @@ import { findInventoryModel } from '@kbn/metrics-data-access-plugin/common';
import { EuiToolTip } from '@elastic/eui';
import { EuiBadge } from '@elastic/eui';
import { FormattedMessage } from '@kbn/i18n-react';
import { APM_HOST_TROUBLESHOOTING_LINK } from '../../../../components/asset_details/constants';
import { Popover } from '../../../../components/asset_details/tabs/common/popover';
import { HOST_NAME_FIELD } from '../../../../../common/constants';
import { useKibanaContextForPlugin } from '../../../../hooks/use_kibana';
@@ -318,7 +319,7 @@ export const useHostsTable = () => {
<p>
<EuiLink
data-test-subj="hostsView-tableRow-hasSystemMetrics-learnMoreLink"
-href="https://ela.st/host-troubleshooting"
+href={APM_HOST_TROUBLESHOOTING_LINK}
target="_blank"
>
<FormattedMessage
@@ -53,6 +53,7 @@ export const initMetadataRoute = (libs: InfraBackendLibs) => {
timeRange
);
const metricFeatures = pickFeatureName(metricsMetadata.buckets).map(nameToFeature('metrics'));
const hasSystemIntegration = metricsMetadata.hasSystemIntegration;

const info = await getNodeInfo(
framework,
@@ -82,6 +83,7 @@ export const initMetadataRoute = (libs: InfraBackendLibs) => {
body: InfraMetadataRT.encode({
id,
name,
hasSystemIntegration,
features: [...metricFeatures, ...cloudMetricsFeatures],
info: {
...info,
@@ -15,12 +15,14 @@ import {
} from '../../../lib/adapters/framework';
import { KibanaFramework } from '../../../lib/adapters/framework/kibana_framework_adapter';
import { InfraSourceConfiguration } from '../../../lib/sources';
-import { TIMESTAMP_FIELD } from '../../../../common/constants';
+import { HOST_NAME_FIELD, SYSTEM_INTEGRATION, TIMESTAMP_FIELD } from '../../../../common/constants';
+import { getFilterByIntegration } from '../../infra/lib/helpers/query';

export interface InfraMetricsAdapterResponse {
id: string;
name?: string;
buckets: InfraMetadataAggregationBucket[];
hasSystemIntegration: boolean;
}

export const getMetricMetadata = async (
@@ -70,6 +72,20 @@ export const getMetricMetadata = async (
size: 1000,
},
},
monitoredHost: {
filter: getFilterByIntegration(SYSTEM_INTEGRATION),
aggs: {
name: {
terms: {
field: HOST_NAME_FIELD,
size: 1,
order: {
_key: 'asc',
},
},
},
},
},
Review comment from @crespocarlos (Contributor), Aug 23, 2024:

This endpoint is used for other asset types. We might want to run this agg only if nodeType === 'host'.

Also, running a query like getHasDataFromSystemIntegration in parallel with this one might be faster than this terms agg, because it terminates early as soon as it finds a document that satisfies the criteria.

Reply from the PR author (Member):

I made the aggregation conditional, so it is set only when nodeType === 'host'. This query already filters by node name, so I don't think it should be slow 🤔 Wdyt? I could also write a separate query, again filtered by node name, and use it as a boolean flag (whether any data is returned or not), but I'm not sure there is a big benefit here: that would mean querying and filtering a second time, whereas the aggregation is added to an existing query.

Reply from @crespocarlos (Contributor), Aug 23, 2024:

Hey! Thanks for adding the condition.

The fact that the query filters by host.name helps, but Elasticsearch still needs to collect documents to build the aggregation. getHasDataFromSystemIntegration requires less computation, and the function could even be reused.

Another potential downside is if we need to do something similar for other asset types. I'm not sure if adding aggregations to this query will scale nicely.

But it's up to you. We can always revisit this in the future.

Reply from the PR author (Member):

Got it, thanks. Yeah, now that I think about it, it is a good idea. It adds a dependency on the infraMetricsClient, but I guess that is fine, as we can reuse it if we have similar queries for different assets.

},
},
};
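
The alternative raised in the review thread above, a separate existence check that stops at the first matching document, could look roughly like the sketch below. It is not the implementation of getHasDataFromSystemIntegration, which this PR does not show. The helper names, the `event.module` field used to detect the system integration, and the literal values of the field constants are all assumptions made for illustration. Only `size`, `terminate_after`, and `track_total_hits` are standard Elasticsearch search parameters.

```typescript
// Assumed field names; the real constants live in common/constants and are not shown here.
const HOST_NAME_FIELD = 'host.name';
const TIMESTAMP_FIELD = '@timestamp';
const INTEGRATION_FIELD = 'event.module'; // assumption: how the integration filter is expressed

// Hypothetical builder for an existence-only search body.
function buildHasSystemDataRequest(hostName: string, from: number, to: number) {
  return {
    size: 0, // no documents are needed, only whether one exists
    terminate_after: 1, // each shard stops collecting after its first match
    track_total_hits: true,
    query: {
      bool: {
        filter: [
          { term: { [HOST_NAME_FIELD]: hostName } },
          { term: { [INTEGRATION_FIELD]: 'system' } },
          { range: { [TIMESTAMP_FIELD]: { gte: from, lte: to, format: 'epoch_millis' } } },
        ],
      },
    },
  };
}

// hits.total can be a plain number or an object, depending on client and settings.
type TotalHits = number | { value: number; relation?: string };

function hasAnyHit(total: TotalHits | undefined): boolean {
  if (total === undefined) return false;
  return (typeof total === 'number' ? total : total.value) > 0;
}
```

Because `terminate_after` applies per shard, the reported total can exceed 1 on a multi-shard index, which is why the check is `> 0` rather than `=== 1`. The trade-off discussed in the thread remains: this is a second request, while the terms aggregation rides along on a query that is already being sent.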
@@ -79,17 +95,25 @@ export const getMetricMetadata = async (
{
metrics?: InfraMetadataAggregationResponse;
nodeName?: InfraMetadataAggregationResponse;
monitoredHost?: { name: InfraMetadataAggregationResponse };
}
>(requestContext, 'search', metricQuery);

const buckets =
response.aggregations && response.aggregations.metrics
? response.aggregations.metrics.buckets
: [];
const hostWithSystemIntegration =
response.aggregations && (response.aggregations?.monitoredHost?.name?.buckets ?? []).length > 0
? response.aggregations?.monitoredHost?.name.buckets[0]?.key
: null;

const hasSystemIntegration = hostWithSystemIntegration === nodeId;
Review comment from @crespocarlos (Contributor), Aug 23, 2024:

This could also be done only when nodeType === 'host'

Reply from the PR author (Member):

Done 👍


return {
id: nodeId,
name: get(response, ['aggregations', 'nodeName', 'buckets', 0, 'key'], nodeId),
hasSystemIntegration,
Review comment from a Contributor:

This attribute kind of makes it harder to extend the same idea to other asset types because this condition is specific to hosts.

For now, I think it's OK because we don't know if the problem we're solving here will happen to other asset types, but I would return this attribute only when nodeType === 'host'. Wdyt?

Reply from the PR author (Member):

I changed that to be returned only for hosts, thanks for catching that!

buckets,
};
};
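
The outcome of the review threads above (compute the flag only for hosts and return the attribute only for hosts) is not visible in this 4-commit view of the diff. A minimal sketch of what that could look like follows. The function names and the `MetricMetadata` shape are hypothetical, and the real adapter falls back to the `nodeName` aggregation for `name`, which is omitted here.

```typescript
interface TermsBucket {
  key: string;
  doc_count: number;
}

interface MonitoredHostAggregation {
  name?: { buckets: TermsBucket[] };
}

// Hypothetical result shape: the flag is optional so other asset types can omit it.
interface MetricMetadata {
  id: string;
  name: string;
  hasSystemIntegration?: boolean;
}

// Returns undefined for non-host asset types, so the caller can leave the attribute out.
function resolveHasSystemIntegration(
  nodeType: string,
  nodeId: string,
  monitoredHost?: MonitoredHostAggregation
): boolean | undefined {
  if (nodeType !== 'host') return undefined;
  // Same comparison as in the diff: the first bucket key must equal the node id.
  return monitoredHost?.name?.buckets[0]?.key === nodeId;
}

function buildMetricMetadata(
  nodeType: string,
  nodeId: string,
  monitoredHost?: MonitoredHostAggregation
): MetricMetadata {
  const hasSystemIntegration = resolveHasSystemIntegration(nodeType, nodeId, monitoredHost);
  return {
    id: nodeId,
    name: nodeId, // simplification: the nodeName aggregation lookup is left out
    // Conditional spread: the key is absent, not undefined, for non-host asset types.
    ...(hasSystemIntegration === undefined ? {} : { hasSystemIntegration }),
  };
}
```

With this shape, the runtime type on the client side would presumably need `hasSystemIntegration` to be optional rather than part of the required fields. That is an inference from the review discussion, not something shown in the diff.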