Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

head: head_backwards for non-seekable files like /proc/* or fifos (named pipes) #5732

Merged
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
185c8bb
implement head_backwards for non-seekable files like /proc/* or pipes
cre4ture Dec 27, 2023
37bf664
try to fix windows CI/CD
cre4ture Dec 27, 2023
ef32af1
disable proc-fs related tests for macos
cre4ture Dec 27, 2023
06c3329
cargo fmt
cre4ture Dec 27, 2023
1190ab8
disable helper function for gnu compare on windows
cre4ture Dec 27, 2023
effc313
disable proc-fs related tests for freebsd
cre4ture Dec 27, 2023
937283b
realized that read_but_last_n_xxx is already there
cre4ture Dec 27, 2023
71a3df1
tmp commit of test with dev null and very largs number
cre4ture Dec 27, 2023
98fa947
remove test again
cre4ture Dec 27, 2023
58c90d1
add tests for head read backwards lines
cre4ture Dec 27, 2023
3fa05c5
cleanup original location from previously move check
cre4ture Dec 27, 2023
1d08a70
simplify tests - avoid comparison with live gnu
cre4ture Dec 27, 2023
d16d8b0
Merge branch 'main' into feature/head_backwards_nonseekable_files
cre4ture Dec 27, 2023
9f51541
inject too small function run_and_check_if_it_outputs_something
cre4ture Dec 27, 2023
4c58756
reduce code duplication
cre4ture Dec 27, 2023
6e310b7
Merge remote-tracking branch 'origin/main' into feature/head_backward…
cre4ture Dec 28, 2023
f279461
function renaming to reflect the purpose
cre4ture Dec 28, 2023
c7ef228
implement gnu-compatibility optimization for small files
cre4ture Dec 29, 2023
e0af0f3
fix wrong testname
cre4ture Dec 29, 2023
2cc6e35
cargo fmt
cre4ture Dec 29, 2023
5d255be
rename variable to improve readability
cre4ture Dec 29, 2023
e90e8bc
windows: blksize is 512
cre4ture Dec 30, 2023
d9c1f4c
cargo fmt
cre4ture Dec 30, 2023
68e7cb2
head: fix 32bit test
cre4ture Dec 30, 2023
d0ebc5a
Merge branch 'main' into feature/head_backwards_nonseekable_files
cre4ture Dec 31, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
160 changes: 97 additions & 63 deletions src/uu/head/src/head.rs
Original file line number Diff line number Diff line change
Expand Up @@ -243,63 +243,82 @@
Ok(())
}

fn read_but_last_n_bytes(input: &mut impl std::io::BufRead, n: usize) -> std::io::Result<()> {
fn checked_convert_to_usize_or_print_error(n: u64) -> Option<usize> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please rename this function to explain the goal, not the implementation

like
get_bytes()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is used at two places. Once for bytes and once for lines. So we could call it 'get_number_of_bytes_or_lines'.

But to be honest, I do currently not understand why this should improve the readability. The only reason for this function to be called is that the downcast from u64 to usize might not be possible if we are on a 32 bit platform and the user wants us to print all but last with very large number.

Could you please elaborate your proposal a bit more?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I provided a new proposal. Let me know if your good with it.

match usize::try_from(n) {
Ok(value) => Some(value),
Err(e) => {
show!(USimpleError::new(

Check warning on line 250 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L250

Added line #L250 was not covered by tests
1,
format!("{e}: number of bytes is too large")

Check warning on line 252 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L252

Added line #L252 was not covered by tests
));
None
}

Check warning on line 255 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L254-L255

Added lines #L254 - L255 were not covered by tests
}
}

fn read_but_last_n_bytes(input: &mut impl std::io::BufRead, n: u64) -> std::io::Result<()> {
if n == 0 {
//prints everything
return read_n_bytes(input, std::u64::MAX);
}

let stdout = std::io::stdout();
let mut stdout = stdout.lock();
if let Some(n) = checked_convert_to_usize_or_print_error(n) {
let stdout = std::io::stdout();
let mut stdout = stdout.lock();

let mut ring_buffer = vec![0u8; n];
let mut ring_buffer = vec![0u8; n];

// first we fill the ring buffer
if let Err(e) = input.read_exact(&mut ring_buffer) {
if e.kind() == ErrorKind::UnexpectedEof {
return Ok(());
} else {
return Err(e);
}
}
let mut buffer = [0u8; BUF_SIZE];
loop {
let read = loop {
match input.read(&mut buffer) {
Ok(n) => break n,
Err(e) => match e.kind() {
ErrorKind::Interrupted => {}
_ => return Err(e),
},
}
};
if read == 0 {
return Ok(());
} else if read >= n {
stdout.write_all(&ring_buffer)?;
stdout.write_all(&buffer[..read - n])?;
for i in 0..n {
ring_buffer[i] = buffer[read - n + i];
// first we fill the ring buffer
if let Err(e) = input.read_exact(&mut ring_buffer) {
if e.kind() == ErrorKind::UnexpectedEof {
return Ok(());
} else {
return Err(e);

Check warning on line 276 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L276

Added line #L276 was not covered by tests
}
} else {
stdout.write_all(&ring_buffer[..read])?;
for i in 0..n - read {
ring_buffer[i] = ring_buffer[read + i];
}
let mut buffer = [0u8; BUF_SIZE];
loop {
let read = loop {
match input.read(&mut buffer) {
Ok(n) => break n,
Err(e) => match e.kind() {
ErrorKind::Interrupted => {}
_ => return Err(e),
},

Check warning on line 287 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L286-L287

Added lines #L286 - L287 were not covered by tests
}
};
if read == 0 {
return Ok(());
} else if read >= n {
stdout.write_all(&ring_buffer)?;
stdout.write_all(&buffer[..read - n])?;
for i in 0..n {
ring_buffer[i] = buffer[read - n + i];
}
} else {
stdout.write_all(&ring_buffer[..read])?;
for i in 0..n - read {
ring_buffer[i] = ring_buffer[read + i];
}
ring_buffer[n - read..].copy_from_slice(&buffer[..read]);
}
ring_buffer[n - read..].copy_from_slice(&buffer[..read]);
}
}

Ok(())

Check warning on line 308 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L308

Added line #L308 was not covered by tests
}

fn read_but_last_n_lines(
input: impl std::io::BufRead,
n: usize,
n: u64,
separator: u8,
) -> std::io::Result<()> {
let stdout = std::io::stdout();
let mut stdout = stdout.lock();
for bytes in take_all_but(lines(input, separator), n) {
stdout.write_all(&bytes?)?;
if let Some(n) = checked_convert_to_usize_or_print_error(n) {
let stdout = std::io::stdout();
let mut stdout = stdout.lock();
for bytes in take_all_but(lines(input, separator), n) {
stdout.write_all(&bytes?)?;
}
}
Ok(())
}
Expand Down Expand Up @@ -380,7 +399,41 @@
}
}

fn is_seekable(input: &mut std::fs::File) -> bool {
let current_pos = input.stream_position();
current_pos.is_ok()
&& input.seek(SeekFrom::End(0)).is_ok()
&& input.seek(SeekFrom::Start(current_pos.unwrap())).is_ok()
}

fn head_backwards_file(input: &mut std::fs::File, options: &HeadOptions) -> std::io::Result<()> {
let seekable = is_seekable(input);
if seekable {
return head_backwards_on_seekable_file(input, options);
}

head_backwards_on_non_seekable_file(input, options)
}

fn head_backwards_on_non_seekable_file(
input: &mut std::fs::File,
options: &HeadOptions,
) -> std::io::Result<()> {
let reader = &mut std::io::BufReader::with_capacity(BUF_SIZE, &*input);

match options.mode {
Mode::AllButLastBytes(n) => read_but_last_n_bytes(reader, n)?,
Mode::AllButLastLines(n) => read_but_last_n_lines(reader, n, options.line_ending.into())?,
_ => unreachable!(),

Check warning on line 427 in src/uu/head/src/head.rs

View check run for this annotation

Codecov / codecov/patch

src/uu/head/src/head.rs#L427

Added line #L427 was not covered by tests
}

Ok(())
}

fn head_backwards_on_seekable_file(
input: &mut std::fs::File,
options: &HeadOptions,
) -> std::io::Result<()> {
match options.mode {
Mode::AllButLastBytes(n) => {
let size = input.metadata()?.len();
Expand Down Expand Up @@ -434,32 +487,13 @@
let stdin = std::io::stdin();
let mut stdin = stdin.lock();

// Outputting "all-but-last" requires us to use a ring buffer with size n, so n
// must be converted from u64 to usize to fit in memory. If such conversion fails,
// it means the platform doesn't have enough memory to hold the buffer, so we fail.
if let Mode::AllButLastLines(n) | Mode::AllButLastBytes(n) = options.mode {
if let Err(e) = usize::try_from(n) {
show!(USimpleError::new(
1,
format!("{e}: number of bytes is too large")
));
continue;
};
};

match options.mode {
Mode::FirstBytes(n) => read_n_bytes(&mut stdin, n),
// unwrap is guaranteed to succeed because we checked the value of n above
Mode::AllButLastBytes(n) => {
read_but_last_n_bytes(&mut stdin, n.try_into().unwrap())
}
Mode::AllButLastBytes(n) => read_but_last_n_bytes(&mut stdin, n),
Mode::FirstLines(n) => read_n_lines(&mut stdin, n, options.line_ending.into()),
// unwrap is guaranteed to succeed because we checked the value of n above
Mode::AllButLastLines(n) => read_but_last_n_lines(
&mut stdin,
n.try_into().unwrap(),
options.line_ending.into(),
),
Mode::AllButLastLines(n) => {
read_but_last_n_lines(&mut stdin, n, options.line_ending.into())
}
}
}
(name, false) => {
Expand Down
42 changes: 42 additions & 0 deletions tests/by-util/test_head.rs
Original file line number Diff line number Diff line change
Expand Up @@ -378,3 +378,45 @@ fn test_presume_input_pipe_5_chars() {
.run()
.stdout_is_fixture("lorem_ipsum_5_chars.expected");
}

#[cfg(all(
not(target_os = "windows"),
not(target_os = "macos"),
not(target_os = "freebsd")
))]
#[test]
fn test_read_backwards_bytes_proc_fs_version() {
let ts = TestScenario::new(util_name!());

let args = ["-c", "-1", "/proc/version"];
let result = ts.ucmd().args(&args).succeeds();
sylvestre marked this conversation as resolved.
Show resolved Hide resolved
assert!(result.stdout().len() > 0);
}

#[cfg(all(
not(target_os = "windows"),
not(target_os = "macos"),
not(target_os = "freebsd")
))]
#[test]
fn test_read_backwards_bytes_proc_fs_modules() {
let ts = TestScenario::new(util_name!());

let args = ["-c", "-1", "/proc/modules"];
let result = ts.ucmd().args(&args).succeeds();
assert!(result.stdout().len() > 0);
}

#[cfg(all(
not(target_os = "windows"),
not(target_os = "macos"),
not(target_os = "freebsd")
))]
#[test]
fn test_read_backwards_lines_proc_fs_modules() {
let ts = TestScenario::new(util_name!());

let args = ["--lines", "-1", "/proc/modules"];
let result = ts.ucmd().args(&args).succeeds();
assert!(result.stdout().len() > 0);
}
Loading