Merge pull request #6 from sfcompute/kenny/feat_library
Add Runtime Dependency Installation for Required System Packages and Library functionality
kennethdsheridan authored Nov 21, 2024
2 parents badcb1a + 085c826 commit cf89f0d
Showing 7 changed files with 413 additions and 310 deletions.
Binary file added .DS_Store
Binary file not shown.
30 changes: 20 additions & 10 deletions .idea/workspace.xml


9 changes: 9 additions & 0 deletions Cargo.toml
@@ -31,3 +31,12 @@ opt-level = 3 # Maximum optimization
lto = true # Enable link-time optimization
codegen-units = 1 # Maximize performance
strip = true # Strip symbols from binary

# Hybrid configuration for library and binary
[lib]
name = "hardware_report" # Optional, defaults to the package name
path = "src/lib.rs" # Path to the library file

[[bin]]
name = "hardware_report_binary" # Name of the binary
path = "src/bin/hardware_report.rs" # Path to the binary's main.rs
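Note on the hybrid layout: with both a `[lib]` and a `[[bin]]` target in one package, the binary imports the library by the `[lib]` name, which is how `src/bin/hardware_report.rs` later reaches `ServerInfo`. A minimal single-file sketch of that relationship follows. The inline `mod hardware_report` is only a stand-in for `src/lib.rs`, and the `collect()` body, the single `hostname` field, and the `hostname_line` helper are hypothetical placeholders rather than the crate's real implementation.

```rust
use std::error::Error;

// Stand-in for src/lib.rs. In the real package this is a separate crate
// target; a module is used here only so the sketch fits in one file.
mod hardware_report {
    use std::error::Error;

    pub struct ServerInfo {
        pub hostname: String,
    }

    impl ServerInfo {
        // Hypothetical body: the real collect() probes the machine.
        pub fn collect() -> Result<Self, Box<dyn Error>> {
            Ok(ServerInfo {
                hostname: String::from("example-host"),
            })
        }
    }
}

// Stand-in for src/bin/hardware_report.rs: only `pub` items of the
// library are reachable from the binary.
use hardware_report::ServerInfo;

// Hypothetical helper that formats the same line the binary prints.
fn hostname_line() -> Result<String, Box<dyn Error>> {
    let info = ServerInfo::collect()?;
    Ok(format!("Hostname: {}", info.hostname))
}
```

Anything the binary needs, such as `ServerInfo::collect` or `ServerInfo::extract_dmidecode_value`, therefore has to be declared `pub` in the library.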
40 changes: 38 additions & 2 deletions Readme.md
@@ -1,6 +1,42 @@
# Hardware Report
A Rust utility that automatically collects and reports detailed hardware information from Linux servers, outputting the data in TOML format.

The collected data is saved as `<chassis_serialnumber>_hardware_report.toml`, so each server gets its own distinctly named report and reports from many machines can be collected side by side. These reports support infrastructure standardization across heterogeneous bare-metal hardware, letting operators automate and manage configurations consistently.

This tool is designed to help the open-source GPU infrastructure community by providing a uniform method for gathering and serializing system data, which can be particularly beneficial when managing diverse clusters of GPUs and servers with varying configurations.

## Quick Start

### Use as a Binary
To compile the binary for `hardware_report`:

```bash
# Build for your platform
cargo build --release

# The binary will be available at:
target/release/hardware_report_binary
```

### Use as a Library
You can use `hardware_report` as a library in your Rust project. Add the following to your `Cargo.toml`:

```toml
[dependencies]
hardware_report = { path = "../path/to/hardware_report" }
```

Then, in your Rust code:

```rust
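use hardware_report::ServerInfo;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Collect hardware information for the current machine
    let server_info = ServerInfo::collect()?;
    println!("Hostname: {}", server_info.hostname);
    println!("CPU: {}", server_info.summary.cpu_summary);
    Ok(())
}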
```

## ⚠️ IMPORTANT BUILD REQUIREMENT ⚠️
**DOCKER MUST BE RUNNING ON YOUR LOCAL MACHINE TO COMPILE FOR LINUX ON NON-LINUX SYSTEMS**
**WITHOUT DOCKER RUNNING, THE BUILD WILL FAIL WHEN EXECUTING `make linux` ON macOS OR WINDOWS**
@@ -30,7 +66,7 @@ A Rust utility that automatically collects and reports detailed hardware informa
- Make
- **Docker (REQUIRED for cross-compilation on non-Linux systems)**

-### Optional System Utilities
+### Required System Utilities
- `nvidia-smi` (required for NVIDIA GPU information)
- `ipmitool` (required for BMC information)
- `ethtool` (required for network interface details)
@@ -836,7 +872,7 @@ The program handles various error cases gracefully:
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

## Authors
-- SF Compute
+- Kenny Sheridan, Supercomputing Engineer

## Acknowledgments
- This tool makes use of various Linux system utilities and their output formats
Binary file modified build/release/hardware_report-linux-x86_64
Binary file not shown.
178 changes: 178 additions & 0 deletions src/bin/hardware_report.rs
@@ -0,0 +1,178 @@
use std::error::Error;
use std::process::Command;

use hardware_report::ServerInfo;

fn main() -> Result<(), Box<dyn Error>> {
    // Collect server information
    let server_info = ServerInfo::collect()?;

    // Generate summary output for console
    println!("System Summary:");
    println!("==============");

    // Print the system Hostname
    println!("Hostname: {}", server_info.hostname);

    println!("CPU: {}", server_info.summary.cpu_summary);
    println!(
        "Total: {} Cores, {} Threads",
        server_info.summary.cpu_topology.total_cores,
        server_info.summary.cpu_topology.total_threads
    );

    // Print memory total, type, and speed
    println!(
        "Memory: {} {} @ {}",
        server_info.hardware.memory.total,
        server_info.hardware.memory.type_,
        server_info.hardware.memory.speed
    );

    println!(
        "Storage: {} (Total: {:.2} TB)",
        server_info.summary.total_storage, server_info.summary.total_storage_tb
    );

    // List the size of each storage device, joined with " + "
    let disk_sizes = server_info
        .hardware
        .storage
        .devices
        .iter()
        .map(|device| device.size.clone())
        .collect::<Vec<String>>()
        .join(" + ");
    println!("Available Disks: {}", disk_sizes);

    // Get BIOS information from dmidecode
    let output = Command::new("dmidecode").args(["-t", "bios"]).output()?;
    let bios_str = String::from_utf8(output.stdout)?;
    println!(
        "BIOS: {} {} ({})",
        ServerInfo::extract_dmidecode_value(&bios_str, "Vendor")?,
        ServerInfo::extract_dmidecode_value(&bios_str, "Version")?,
        ServerInfo::extract_dmidecode_value(&bios_str, "Release Date")?
    );

    // Get chassis information from dmidecode
    let output = Command::new("dmidecode").args(["-t", "chassis"]).output()?;
    let chassis_str = String::from_utf8(output.stdout)?;
    println!(
        "Chassis: {} {} (S/N: {})",
        ServerInfo::extract_dmidecode_value(&chassis_str, "Manufacturer")?,
        ServerInfo::extract_dmidecode_value(&chassis_str, "Type")?,
        ServerInfo::extract_dmidecode_value(&chassis_str, "Serial Number")?
    );

    // Get motherboard information from server_info
    println!(
        "Motherboard: {} {} v{} (S/N: {})",
        server_info.summary.motherboard.manufacturer,
        server_info.summary.motherboard.product_name,
        server_info.summary.motherboard.version,
        server_info.summary.motherboard.serial
    );

    println!("\nNetwork Interfaces:");
    for nic in &server_info.network.interfaces {
        println!(
            " {} - {} {} ({}) [Speed: {}] [NUMA: {}]",
            nic.name,
            nic.vendor,
            nic.model,
            nic.pci_id,
            nic.speed.as_deref().unwrap_or("Unknown"),
            nic.numa_node
                .map_or("Unknown".to_string(), |n| n.to_string())
        );
    }

    println!("\nGPUs:");
    for gpu in &server_info.hardware.gpus.devices {
        println!(
            " {} - {} ({}) [NUMA: {}]",
            gpu.name,
            gpu.vendor,
            gpu.pci_id,
            gpu.numa_node
                .map_or("Unknown".to_string(), |n| n.to_string())
        );
    }

    println!("\nNUMA Topology:");
    for (node_id, node) in &server_info.summary.numa_topology {
        println!(" Node {}:", node_id);
        println!(" Memory: {}", node.memory);
        println!(" CPUs: {:?}", node.cpus);

        if !node.devices.is_empty() {
            println!(" Devices:");
            for device in &node.devices {
                println!(
                    " {} - {} (PCI ID: {})",
                    device.type_, device.name, device.pci_id
                );
            }
        }

        println!(" Distances:");
        let mut distances: Vec<_> = node.distances.iter().collect();
        distances.sort_by_key(|&(k, _)| k);
        for (to_node, distance) in distances {
            println!(" To Node {}: {}", to_node, distance);
        }
    }

    // Get filesystem information
    println!("\nFilesystems:");
    let output = Command::new("df")
        .args(["-h", "--output=source,fstype,size,used,avail,target"])
        .output()?;
    let fs_str = String::from_utf8(output.stdout)?;
    for line in fs_str.lines().skip(1) {
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() >= 6 {
            println!(
                " {} ({}) - {} total, {} used, {} available, mounted on {}",
                fields[0], fields[1], fields[2], fields[3], fields[4], fields[5]
            );
        }
    }

    // Get the chassis serial number and sanitize it for use as the file name
    let chassis_serial = server_info.summary.chassis.serial.clone();
    let safe_filename = sanitize_filename(&chassis_serial);

    // Nested helper: item declarations are visible throughout the enclosing
    // block, so it can be called above its definition.
    fn sanitize_filename(filename: &str) -> String {
        filename
            .chars()
            .map(|c| {
                if c.is_alphanumeric() || c == '-' {
                    c
                } else {
                    '_'
                }
            })
            .collect::<String>()
    }

    println!(
        "\nCreating TOML output for system serial number: {}",
        safe_filename
    );

    let output_filename = format!("{}_hardware_report.toml", safe_filename);

    // Convert to TOML
    let toml_string = toml::to_string_pretty(&server_info)?;

    // Write to file
    std::fs::write(&output_filename, toml_string)?;

    println!("Configuration has been written to {}", output_filename);

    Ok(())
}
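Two steps in `main` above are pure string handling and could be lifted to module scope so they can be unit tested without `dmidecode` or `df`. The sketch below assumes that refactor. `sanitize_filename` keeps the same rule as the nested helper, while `FsRow` and `parse_df_row` are hypothetical names introduced here for the `df` row parsing. The sample inputs in the usage note are made up for illustration.

```rust
/// Replace every character that is not alphanumeric or '-' with '_'.
/// Same rule as the nested helper in `main`. Note that
/// `char::is_alphanumeric` is Unicode-aware, so non-ASCII letters pass
/// through unchanged.
fn sanitize_filename(name: &str) -> String {
    name.chars()
        .map(|c| if c.is_alphanumeric() || c == '-' { c } else { '_' })
        .collect()
}

/// One row of `df --output=source,fstype,size,used,avail,target`.
/// Hypothetical type, not part of the original commit.
#[derive(Debug, PartialEq)]
struct FsRow<'a> {
    source: &'a str,
    fstype: &'a str,
    size: &'a str,
    used: &'a str,
    avail: &'a str,
    target: &'a str,
}

/// Split a row on whitespace and take the first six fields, returning
/// `None` for rows with fewer than six. Like the loop in `main`, this
/// truncates mount points that contain spaces.
fn parse_df_row(line: &str) -> Option<FsRow<'_>> {
    let mut fields = line.split_whitespace();
    Some(FsRow {
        source: fields.next()?,
        fstype: fields.next()?,
        size: fields.next()?,
        used: fields.next()?,
        avail: fields.next()?,
        target: fields.next()?,
    })
}
```

With these in place, `sanitize_filename("SN 12/34-AB")` yields `SN_12_34-AB`, and `parse_df_row("/dev/sda1 ext4 100G 40G 60G /")` yields a row whose `target` is `/`. If serial numbers must be restricted to ASCII, swapping in `is_ascii_alphanumeric` would be one option.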