Store metadata separately in rlib files
Right now, whenever an rlib file is linked against, all of the metadata from the
rlib is pulled into the final staticlib or binary. The reason for this is that
the metadata is currently stored in a section of the object file. Note that this
is intentional for dynamic libraries, so that the metadata is distributed
bundled with the library itself.

This commit alters the situation for rlib libraries to instead store the
metadata in a separate file in the archive. In doing so, when the archive is
passed to the linker, none of the metadata will get pulled into the resulting
executable. Furthermore, the metadata file is skipped when assembling rlibs into
an archive.

The snag in this implementation comes with multiple output formats. When
generating a dylib, the metadata needs to be in the object file, but when
generating an rlib it needs to be separate. In order to accomplish this, the
metadata variable is inserted into an entirely separate LLVM Module which is
then codegen'd into a different location (foo.metadata.o). This is then linked
into dynamic libraries and silently ignored for rlib files.
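
As a rough illustration of the split described above, the following sketch (written in modern Rust, not the 2013 dialect used in this commit) maps each output kind to where its metadata ends up. The `OutputKind` and `MetadataPlacement` types and the `metadata_placement` function are hypothetical names invented for this example; only the `metadata` member name and the `metadata.o` extension come from the diff.

```rust
use std::path::{Path, PathBuf};

/// Output kinds discussed in the commit message (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputKind {
    Rlib,
    Dylib,
    Staticlib,
}

/// Where the crate metadata ends up for a given output kind.
#[derive(Debug, PartialEq, Eq)]
pub enum MetadataPlacement {
    /// Raw bytes stored as a separate member of the rlib archive.
    ArchiveMember(&'static str),
    /// A metadata object file that is handed to the linker.
    LinkedObject(PathBuf),
    /// No metadata is included at all.
    Omitted,
}

/// Decide the metadata placement for `kind`, given the crate's main
/// object file path (for example `foo.o`).
pub fn metadata_placement(kind: OutputKind, obj: &Path) -> MetadataPlacement {
    match kind {
        // rlibs carry the metadata as a plain file inside the archive.
        OutputKind::Rlib => MetadataPlacement::ArchiveMember("metadata"),
        // dylibs link in the separately generated `foo.metadata.o`.
        OutputKind::Dylib => {
            MetadataPlacement::LinkedObject(obj.with_extension("metadata.o"))
        }
        // Static archives have no use for the metadata.
        OutputKind::Staticlib => MetadataPlacement::Omitted,
    }
}
```

Under these assumptions, calling `metadata_placement(OutputKind::Dylib, Path::new("foo.o"))` yields the `foo.metadata.o` path, while the rlib case never touches that object file.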

While changing how metadata is inserted into archives, I have also stopped
compressing metadata when it is inserted into rlib files. We have wanted to stop
compressing metadata, but the sections it creates in object files are
apparently too large. Thankfully, if it's just an arbitrary file, it doesn't
matter how large it is.

I have seen massive reductions in executable sizes, as well as staticlib output
sizes (to confirm that this is all working).
alexcrichton committed Dec 4, 2013
1 parent 693ec73 commit 7c8bf23
Showing 7 changed files with 156 additions and 78 deletions.
14 changes: 11 additions & 3 deletions src/librustc/back/archive.rs
@@ -20,6 +20,8 @@ use std::str;
use extra::tempfile::TempDir;
use syntax::abi;

pub static METADATA_FILENAME: &'static str = "metadata";

pub struct Archive {
priv sess: Session,
priv dst: Path,
@@ -81,17 +83,22 @@ impl Archive {
/// search in the relevant locations for a library named `name`.
pub fn add_native_library(&mut self, name: &str) {
let location = self.find_library(name);
self.add_archive(&location, name);
self.add_archive(&location, name, []);
}

/// Adds all of the contents of the rlib at the specified path to this
/// archive.
pub fn add_rlib(&mut self, rlib: &Path) {
let name = rlib.filename_str().unwrap().split('-').next().unwrap();
self.add_archive(rlib, name);
self.add_archive(rlib, name, [METADATA_FILENAME]);
}

/// Adds an arbitrary file to this archive
pub fn add_file(&mut self, file: &Path) {
run_ar(self.sess, "r", None, [&self.dst, file]);
}

fn add_archive(&mut self, archive: &Path, name: &str) {
fn add_archive(&mut self, archive: &Path, name: &str, skip: &[&str]) {
let loc = TempDir::new("rsar").unwrap();

// First, extract the contents of the archive to a temporary directory
@@ -106,6 +113,7 @@ impl Archive {
let mut inputs = ~[];
for file in files.iter() {
let filename = file.filename_str().unwrap();
if skip.iter().any(|s| *s == filename) { continue }
let filename = format!("r-{}-{}", name, filename);
let new_filename = file.with_filename(filename);
fs::rename(file, &new_filename);
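
The skip list added to `add_archive` above can be sketched in isolation. This is a minimal modern-Rust approximation, not the code from this commit: the function name `merged_member_names` is hypothetical, and it only computes the renamed member names instead of extracting and renaming files on disk.

```rust
/// Compute the names that the members of an extracted archive would be
/// stored under when merged into another archive.
///
/// Members whose name appears in `skip` are left out entirely; every other
/// member is renamed to `r-{name}-{member}`, following the naming scheme
/// visible in the diff.
pub fn merged_member_names(name: &str, members: &[&str], skip: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for &member in members {
        // Skipped members (such as the rlib's metadata file) are not copied.
        if skip.iter().any(|&s| s == member) {
            continue;
        }
        out.push(format!("r-{}-{}", name, member));
    }
    out
}
```

With `skip` set to `["metadata"]`, as `add_rlib` does, an rlib containing `foo.o` and `metadata` contributes only `r-foo-foo.o`; with an empty skip list, as in `add_native_library`, every member is kept.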
123 changes: 96 additions & 27 deletions src/librustc/back/link.rs
@@ -9,8 +9,9 @@
// except according to those terms.


use back::archive::Archive;
use back::archive::{Archive, METADATA_FILENAME};
use back::rpath;
use driver::driver::CrateTranslation;
use driver::session::Session;
use driver::session;
use lib::llvm::llvm;
@@ -191,10 +192,11 @@ pub mod write {
use back::link::{output_type_assembly, output_type_bitcode};
use back::link::{output_type_exe, output_type_llvm_assembly};
use back::link::{output_type_object};
use driver::driver::CrateTranslation;
use driver::session::Session;
use driver::session;
use lib::llvm::llvm;
use lib::llvm::{ModuleRef, ContextRef};
use lib::llvm::ModuleRef;
use lib;

use std::c_str::ToCStr;
@@ -204,10 +206,11 @@ pub mod write {
use std::str;

pub fn run_passes(sess: Session,
llcx: ContextRef,
llmod: ModuleRef,
trans: &CrateTranslation,
output_type: output_type,
output: &Path) {
let llmod = trans.module;
let llcx = trans.context;
unsafe {
llvm::LLVMInitializePasses();

@@ -313,12 +316,23 @@ pub mod write {
// context, so don't dispose
jit::exec(sess, llcx, llmod, true);
} else {
// Create a codegen-specific pass manager to emit the actual
// assembly or object files. This may not end up getting used,
// but we make it anyway for good measure.
let cpm = llvm::LLVMCreatePassManager();
llvm::LLVMRustAddAnalysisPasses(tm, cpm, llmod);
llvm::LLVMRustAddLibraryInfo(cpm, llmod);
// A codegen-specific pass manager is used to generate object
// files for an LLVM module.
//
// Apparently each of these pass managers is a one-shot kind of
// thing, so we create a new one for each type of output. The
// pass manager passed to the closure should be ensured to not
// escape the closure itself, and the manager should only be
// used once.
fn with_codegen(tm: TargetMachineRef, llmod: ModuleRef,
f: |PassManagerRef|) {
let cpm = llvm::LLVMCreatePassManager();
llvm::LLVMRustAddAnalysisPasses(tm, cpm, llmod);
llvm::LLVMRustAddLibraryInfo(cpm, llmod);
f(cpm);
llvm::LLVMDisposePassManager(cpm);

}

match output_type {
output_type_none => {}
@@ -329,21 +343,48 @@ pub mod write {
}
output_type_llvm_assembly => {
output.with_c_str(|output| {
llvm::LLVMRustPrintModule(cpm, llmod, output)
with_codegen(tm, llmod, |cpm| {
llvm::LLVMRustPrintModule(cpm, llmod, output);
})
})
}
output_type_assembly => {
WriteOutputFile(sess, tm, cpm, llmod, output, lib::llvm::AssemblyFile);
with_codegen(tm, llmod, |cpm| {
WriteOutputFile(sess, tm, cpm, llmod, output,
lib::llvm::AssemblyFile);
});

// windows will invoke this function with an assembly
// output type when it's actually generating an object
// file. This is because g++ is used to compile the
// assembly instead of having LLVM directly output an
// object file. Regardless, in this case, we're going to
// possibly need a metadata file.
if sess.opts.output_type != output_type_assembly {
with_codegen(tm, trans.metadata_module, |cpm| {
let out = output.with_extension("metadata.o");
WriteOutputFile(sess, tm, cpm,
trans.metadata_module, &out,
lib::llvm::ObjectFile);
})
}
}
output_type_exe | output_type_object => {
WriteOutputFile(sess, tm, cpm, llmod, output, lib::llvm::ObjectFile);
with_codegen(tm, llmod, |cpm| {
WriteOutputFile(sess, tm, cpm, llmod, output,
lib::llvm::ObjectFile);
});
with_codegen(tm, trans.metadata_module, |cpm| {
WriteOutputFile(sess, tm, cpm, trans.metadata_module,
&output.with_extension("metadata.o"),
lib::llvm::ObjectFile);
})
}
}

llvm::LLVMDisposePassManager(cpm);
}

llvm::LLVMRustDisposeTargetMachine(tm);
llvm::LLVMDisposeModule(trans.metadata_module);
// the jit takes ownership of these two items
if !sess.opts.jit {
llvm::LLVMDisposeModule(llmod);
@@ -895,10 +936,9 @@ pub fn get_cc_prog(sess: Session) -> ~str {
/// Perform the linkage portion of the compilation phase. This will generate all
/// of the requested outputs for this compilation session.
pub fn link_binary(sess: Session,
crate_types: &[~str],
trans: &CrateTranslation,
obj_filename: &Path,
out_filename: &Path,
lm: LinkMeta) {
out_filename: &Path) {
let outputs = if sess.opts.test {
// If we're generating a test executable, then ignore all other output
// styles at all other locations
@@ -908,7 +948,7 @@ pub fn link_binary(sess: Session,
// look at what was in the crate file itself for generating output
// formats.
let mut outputs = sess.opts.outputs.clone();
for ty in crate_types.iter() {
for ty in trans.crate_types.iter() {
if "bin" == *ty {
outputs.push(session::OutputExecutable);
} else if "dylib" == *ty || "lib" == *ty {
@@ -926,12 +966,13 @@ pub fn link_binary(sess: Session,
};

for output in outputs.move_iter() {
link_binary_output(sess, output, obj_filename, out_filename, lm);
link_binary_output(sess, trans, output, obj_filename, out_filename);
}

// Remove the temporary object file if we aren't saving temps
// Remove the temporary object file and metadata if we aren't saving temps
if !sess.opts.save_temps {
fs::unlink(obj_filename);
fs::unlink(&obj_filename.with_extension("metadata.o"));
}
}

@@ -945,11 +986,11 @@ fn is_writeable(p: &Path) -> bool {
}

fn link_binary_output(sess: Session,
trans: &CrateTranslation,
output: session::OutputStyle,
obj_filename: &Path,
out_filename: &Path,
lm: LinkMeta) {
let libname = output_lib_filename(lm);
out_filename: &Path) {
let libname = output_lib_filename(trans.link);
let out_filename = match output {
session::OutputRlib => {
out_filename.with_filename(format!("lib{}.rlib", libname))
@@ -987,7 +1028,7 @@ fn link_binary_output(sess: Session,

match output {
session::OutputRlib => {
link_rlib(sess, obj_filename, &out_filename);
link_rlib(sess, Some(trans), obj_filename, &out_filename);
}
session::OutputStaticlib => {
link_staticlib(sess, obj_filename, &out_filename);
@@ -1007,9 +1048,25 @@ fn link_binary_output(sess: Session,
// rlib primarily contains the object file of the crate, but it also contains
// all of the object files from native libraries. This is done by unzipping
// native libraries and inserting all of the contents into this archive.
fn link_rlib(sess: Session, obj_filename: &Path,
//
// Instead of putting the metadata in an object file section, rlibs contain
// the metadata in a separate file.
fn link_rlib(sess: Session,
trans: Option<&CrateTranslation>, // None == no metadata
obj_filename: &Path,
out_filename: &Path) -> Archive {
let mut a = Archive::create(sess, out_filename, obj_filename);

match trans {
Some(trans) => {
let metadata = obj_filename.with_filename(METADATA_FILENAME);
fs::File::create(&metadata).write(trans.metadata);
a.add_file(&metadata);
fs::unlink(&metadata);
}
None => {}
}

for &(ref l, kind) in cstore::get_used_libraries(sess.cstore).iter() {
match kind {
cstore::NativeStatic => {
@@ -1029,8 +1086,12 @@ fn link_rlib(sess: Session, obj_filename: &Path,
//
// Additionally, there's no way for us to link dynamic libraries, so we warn
// about all dynamic library dependencies that they're not linked in.
//
// There's no need to include metadata in a static archive, so make sure not to
// link in the metadata object file (and also don't prepare the archive with a
// metadata file).
fn link_staticlib(sess: Session, obj_filename: &Path, out_filename: &Path) {
let mut a = link_rlib(sess, obj_filename, out_filename);
let mut a = link_rlib(sess, None, obj_filename, out_filename);
a.add_native_library("morestack");

let crates = cstore::get_used_crates(sess.cstore, cstore::RequireStatic);
@@ -1111,6 +1172,14 @@ fn link_args(sess: Session,
~"-o", out_filename.as_str().unwrap().to_owned(),
obj_filename.as_str().unwrap().to_owned()]);

// When linking a dynamic library, we put the metadata into a section of the
// executable. This metadata is in a separate object file from the main
// object file, so we link that in here.
if dylib {
let metadata = obj_filename.with_extension("metadata.o");
args.push(metadata.as_str().unwrap().to_owned());
}

if sess.targ_cfg.os == abi::OsLinux {
// GNU-style linkers will use this to omit linking to libraries which
// don't actually fulfill any relocations, but only for libraries which
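
The `with_codegen` helper in the link.rs changes above creates a fresh pass manager for each output, hands it to a closure, and disposes of it afterwards. The same shape can be sketched in modern Rust without any LLVM bindings. Everything here is a hypothetical stand-in: `OneShot` plays the role of the pass manager, and the `live` counter exists only so that creation and disposal can be observed.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a one-shot handle such as a codegen pass
/// manager. It bumps `live` on creation and decrements it on drop.
pub struct OneShot<'a> {
    live: &'a Cell<usize>,
}

impl<'a> OneShot<'a> {
    fn new(live: &'a Cell<usize>) -> Self {
        live.set(live.get() + 1);
        OneShot { live }
    }

    /// Pretend to emit one output with this handle.
    pub fn run(&mut self, what: &str) -> String {
        format!("emitted {}", what)
    }
}

impl Drop for OneShot<'_> {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

/// Create a fresh handle, lend it to `f`, and dispose of it afterwards.
/// Because `f` only receives a borrow, the handle cannot escape the closure.
pub fn with_one_shot<R>(live: &Cell<usize>, f: impl FnOnce(&mut OneShot<'_>) -> R) -> R {
    let mut handle = OneShot::new(live);
    let result = f(&mut handle);
    // `handle` is dropped when this function returns, mirroring the explicit
    // dispose call at the end of `with_codegen`.
    result
}
```

In this sketch the borrow checker enforces what the comment in the diff asks callers to respect by convention: the handle is used inside the closure and is gone once the helper returns.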
13 changes: 6 additions & 7 deletions src/librustc/driver/driver.rs
@@ -331,8 +331,10 @@ pub fn phase_3_run_analysis_passes(sess: Session,
pub struct CrateTranslation {
context: ContextRef,
module: ModuleRef,
metadata_module: ModuleRef,
link: LinkMeta,
crate_types: ~[~str],
metadata: ~[u8],
}

/// Run the translation phase to LLVM, after which the AST and analysis can
@@ -364,8 +366,7 @@ pub fn phase_5_run_llvm_passes(sess: Session,

time(sess.time_passes(), "LLVM passes", (), |_|
link::write::run_passes(sess,
trans.context,
trans.module,
trans,
output_type,
&asm_filename));

@@ -378,8 +379,7 @@ pub fn phase_5_run_llvm_passes(sess: Session,
} else {
time(sess.time_passes(), "LLVM passes", (), |_|
link::write::run_passes(sess,
trans.context,
trans.module,
trans,
sess.opts.output_type,
&outputs.obj_filename));
}
@@ -392,10 +392,9 @@ pub fn phase_6_link_output(sess: Session,
outputs: &OutputFilenames) {
time(sess.time_passes(), "linking", (), |_|
link::link_binary(sess,
trans.crate_types,
trans,
&outputs.obj_filename,
&outputs.out_filename,
trans.link));
&outputs.out_filename));
}

pub fn stop_after_phase_3(sess: Session) -> bool {
14 changes: 6 additions & 8 deletions src/librustc/metadata/encoder.rs
@@ -21,13 +21,14 @@ use middle::ty;
use middle::typeck;
use middle;

use std::cast;
use std::hashmap::{HashMap, HashSet};
use std::io::{Writer, Seek, Decorator};
use std::io::mem::MemWriter;
use std::io::{Writer, Seek, Decorator};
use std::str;
use std::util;
use std::vec;

use extra::flate;
use extra::serialize::Encodable;
use extra;

@@ -47,8 +48,6 @@ use syntax::parse::token;
use syntax;
use writer = extra::ebml::writer;

use std::cast;

// used by astencode:
type abbrev_map = @mut HashMap<ty::t, tyencode::ty_abbrev>;

@@ -1887,10 +1886,9 @@ pub fn encode_metadata(parms: EncodeParams, crate: &Crate) -> ~[u8] {
// remaining % 4 bytes.
wr.write(&[0u8, 0u8, 0u8, 0u8]);

let writer_bytes: &mut ~[u8] = wr.inner_mut_ref();

metadata_encoding_version.to_owned() +
flate::deflate_bytes(*writer_bytes)
// This is a horrible thing to do to the outer MemWriter, but thankfully we
// don't use it again so... it's ok right?
return util::replace(wr.inner_mut_ref(), ~[]);
}

// Get the encoded string for a type
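
The end of `encode_metadata` above moves the encoded bytes out of the in-memory writer by swapping in an empty vector, and returns them uncompressed. A rough modern-Rust equivalent is sketched below. It assumes `std::io::Cursor<Vec<u8>>` as a stand-in for the old `MemWriter` and uses `std::mem::take` in place of `util::replace(..., ~[])`; the `encode` function and its `payload` parameter are hypothetical.

```rust
use std::io::{Cursor, Write};

/// Write `payload` plus four padding bytes into an in-memory writer, then
/// move the underlying buffer out without copying it.
pub fn encode(payload: &[u8]) -> Vec<u8> {
    let mut wr = Cursor::new(Vec::new());
    wr.write_all(payload)
        .expect("writing to an in-memory buffer should not fail");
    // Trailing padding, mirroring the four zero bytes written in the diff.
    wr.write_all(&[0u8, 0, 0, 0])
        .expect("writing to an in-memory buffer should not fail");

    // Swap the writer's buffer for an empty Vec and return the old contents.
    // The writer is left empty, which is fine because it is not used again.
    std::mem::take(wr.get_mut())
}
```

When the writer is owned and no longer needed, `wr.into_inner()` would express the same thing more directly; `take` is shown here because it matches the replace-with-empty approach in the diff.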

4 comments on commit 7c8bf23

@bors bors commented on 7c8bf23 Dec 4, 2013

saw approval from pcwalton
at alexcrichton@7c8bf23

@bors bors commented on 7c8bf23 Dec 4, 2013

merging alexcrichton/rust/smaller-metadata = 7c8bf23 into auto

@bors bors commented on 7c8bf23 Dec 4, 2013

alexcrichton/rust/smaller-metadata = 7c8bf23 merged ok, testing candidate = 495251bb

@bors bors commented on 7c8bf23 Dec 4, 2013
