Skip to content

Commit

Permalink
Rollup merge of rust-lang#68941 - Aaron1011:fix/imported-span, r=petr…
Browse files Browse the repository at this point in the history
…ochenkov

Properly handle Spans that reference imported SourceFiles

Previously, metadata encoding used DUMMY_SP to represent any spans that
referenced an 'imported' SourceFile - e.g. a SourceFile from an upstream
dependency. This currently has no visible consequences, since these
kinds of spans don't currently seem to be emitted anywhere. However,
there's no reason that we couldn't start using such spans in
diagnostics.

This PR changes how we encode and decode spans in crate metadata. We
encode spans in one of two ways:

* 'Local' spans, which reference non-imported SourceFiles, are encoded
  exactly as before.
* 'Foreign' spans, which reference imported SourceFiles, are encoded
  with the CrateNum of their 'originating' crate. Additionally, their
'lo' and 'high' values are rebased on top of the 'originating' crate,
which allows them to be used with the SourceMap data encoded for that
crate.

To support this change, I've also made the following modifications:

* `DefId` and related structs are now moved to `rustc_span`. This allows
  us to use a `CrateNum` inside `SourceFile`. `CrateNum` has special
handling during deserialization (it gets remapped to be the proper
`CrateNum` from the point of view of the current compilation session),
so using a `CrateNum` instead of a plain integer 'workaround type' helps
to simplify deserialization.
* The `ExternalSource` enum is renamed to `ExternalSourceKind`. There is
now a struct called `ExternalSource`, which holds an
`ExternalSourceKind` along with the original line number information for
the file. This is used during `Span` serialization to rebase spans onto
their 'owning' crate.
  • Loading branch information
Manishearth authored Mar 17, 2020
2 parents 9825d5c + 76c7144 commit af55daf
Show file tree
Hide file tree
Showing 11 changed files with 233 additions and 50 deletions.
4 changes: 2 additions & 2 deletions src/librustc/hir/map/collector.rs
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ use rustc_data_structures::stable_hasher::{HashStable, StableHasher};
use rustc_data_structures::svh::Svh;
use rustc_hir as hir;
use rustc_hir::def_id::CRATE_DEF_INDEX;
use rustc_hir::def_id::{CrateNum, DefIndex, LOCAL_CRATE};
use rustc_hir::def_id::{DefIndex, LOCAL_CRATE};
use rustc_hir::intravisit::{self, NestedVisitorMap, Visitor};
use rustc_hir::*;
use rustc_index::vec::{Idx, IndexVec};
Expand Down Expand Up @@ -175,7 +175,7 @@ impl<'a, 'hir> NodeCollector<'a, 'hir> {
.source_map
.files()
.iter()
.filter(|source_file| CrateNum::from_u32(source_file.crate_of_origin) == LOCAL_CRATE)
.filter(|source_file| source_file.cnum == LOCAL_CRATE)
.map(|source_file| source_file.name_hash)
.collect();

Expand Down
8 changes: 3 additions & 5 deletions src/librustc/ich/impls_syntax.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ use crate::ich::StableHashingContext;

use rustc_ast::ast;
use rustc_data_structures::stable_hasher::{HashStable, StableHasher};
use rustc_hir::def_id::{CrateNum, DefId, CRATE_DEF_INDEX};
use rustc_span::SourceFile;

use smallvec::SmallVec;
Expand Down Expand Up @@ -59,7 +58,7 @@ impl<'a> HashStable<StableHashingContext<'a>> for SourceFile {
name_hash,
name_was_remapped,
unmapped_path: _,
crate_of_origin,
cnum,
// Do not hash the source as it is not encoded
src: _,
src_hash,
Expand All @@ -75,9 +74,6 @@ impl<'a> HashStable<StableHashingContext<'a>> for SourceFile {
(name_hash as u64).hash_stable(hcx, hasher);
name_was_remapped.hash_stable(hcx, hasher);

DefId { krate: CrateNum::from_u32(crate_of_origin), index: CRATE_DEF_INDEX }
.hash_stable(hcx, hasher);

src_hash.hash_stable(hcx, hasher);

// We only hash the relative position within this source_file
Expand All @@ -101,6 +97,8 @@ impl<'a> HashStable<StableHashingContext<'a>> for SourceFile {
for &char_pos in normalized_pos.iter() {
stable_normalized_pos(char_pos, start_pos).hash_stable(hcx, hasher);
}

cnum.hash_stable(hcx, hasher);
}
}

Expand Down
91 changes: 85 additions & 6 deletions src/librustc_metadata/rmeta/decoder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -387,7 +387,7 @@ impl<'a, 'tcx> SpecializedDecoder<Span> for DecodeContext<'a, 'tcx> {
return Ok(DUMMY_SP);
}

debug_assert_eq!(tag, TAG_VALID_SPAN);
debug_assert!(tag == TAG_VALID_SPAN_LOCAL || tag == TAG_VALID_SPAN_FOREIGN);

let lo = BytePos::decode(self)?;
let len = BytePos::decode(self)?;
Expand All @@ -399,7 +399,68 @@ impl<'a, 'tcx> SpecializedDecoder<Span> for DecodeContext<'a, 'tcx> {
bug!("Cannot decode Span without Session.")
};

let imported_source_files = self.cdata().imported_source_files(&sess.source_map());
// There are two possibilities here:
// 1. This is a 'local span', which is located inside a `SourceFile`
// that came from this crate. In this case, we use the source map data
// encoded in this crate. This branch should be taken nearly all of the time.
// 2. This is a 'foreign span', which is located inside a `SourceFile`
// that came from a *different* crate (some crate upstream of the one
// whose metadata we're looking at). For example, consider this dependency graph:
//
// A -> B -> C
//
// Suppose that we're currently compiling crate A, and start deserializing
// metadata from crate B. When we deserialize a Span from crate B's metadata,
// there are two posibilites:
//
// 1. The span references a file from crate B. This makes it a 'local' span,
// which means that we can use crate B's serialized source map information.
// 2. The span references a file from crate C. This makes it a 'foreign' span,
// which means we need to use Crate *C* (not crate B) to determine the source
// map information. We only record source map information for a file in the
// crate that 'owns' it, so deserializing a Span may require us to look at
// a transitive dependency.
//
// When we encode a foreign span, we adjust its 'lo' and 'high' values
// to be based on the *foreign* crate (e.g. crate C), not the crate
// we are writing metadata for (e.g. crate B). This allows us to
// treat the 'local' and 'foreign' cases almost identically during deserialization:
// we can call `imported_source_files` for the proper crate, and binary search
// through the returned slice using our span.
let imported_source_files = if tag == TAG_VALID_SPAN_LOCAL {
self.cdata().imported_source_files(sess.source_map())
} else {
// FIXME: We don't decode dependencies of proc-macros.
// Remove this once #69976 is merged
if self.cdata().root.is_proc_macro_crate() {
debug!(
"SpecializedDecoder<Span>::specialized_decode: skipping span for proc-macro crate {:?}",
self.cdata().cnum
);
// Decode `CrateNum` as u32 - using `CrateNum::decode` will ICE
// since we don't have `cnum_map` populated.
// This advances the decoder position so that we can continue
// to read metadata.
let _ = u32::decode(self)?;
return Ok(DUMMY_SP);
}
// tag is TAG_VALID_SPAN_FOREIGN, checked by `debug_assert` above
let cnum = CrateNum::decode(self)?;
debug!(
"SpecializedDecoder<Span>::specialized_decode: loading source files from cnum {:?}",
cnum
);

// Decoding 'foreign' spans should be rare enough that it's
// not worth it to maintain a per-CrateNum cache for `last_source_file_index`.
// We just set it to 0, to ensure that we don't try to access something out
// of bounds for our initial 'guess'
self.last_source_file_index = 0;

let foreign_data = self.cdata().cstore.get_crate_data(cnum);
foreign_data.imported_source_files(sess.source_map())
};

let source_file = {
// Optimize for the case that most spans within a translated item
// originate from the same source_file.
Expand All @@ -413,16 +474,32 @@ impl<'a, 'tcx> SpecializedDecoder<Span> for DecodeContext<'a, 'tcx> {
.binary_search_by_key(&lo, |source_file| source_file.original_start_pos)
.unwrap_or_else(|index| index - 1);

self.last_source_file_index = index;
// Don't try to cache the index for foreign spans,
// as this would require a map from CrateNums to indices
if tag == TAG_VALID_SPAN_LOCAL {
self.last_source_file_index = index;
}
&imported_source_files[index]
}
};

// Make sure our binary search above is correct.
debug_assert!(lo >= source_file.original_start_pos && lo <= source_file.original_end_pos);
debug_assert!(
lo >= source_file.original_start_pos && lo <= source_file.original_end_pos,
"Bad binary search: lo={:?} source_file.original_start_pos={:?} source_file.original_end_pos={:?}",
lo,
source_file.original_start_pos,
source_file.original_end_pos
);

// Make sure we correctly filtered out invalid spans during encoding
debug_assert!(hi >= source_file.original_start_pos && hi <= source_file.original_end_pos);
debug_assert!(
hi >= source_file.original_start_pos && hi <= source_file.original_end_pos,
"Bad binary search: hi={:?} source_file.original_start_pos={:?} source_file.original_end_pos={:?}",
hi,
source_file.original_start_pos,
source_file.original_end_pos
);

let lo =
(lo + source_file.translated_source_file.start_pos) - source_file.original_start_pos;
Expand Down Expand Up @@ -1426,14 +1503,16 @@ impl<'a, 'tcx> CrateMetadataRef<'a> {
let local_version = local_source_map.new_imported_source_file(
name,
name_was_remapped,
self.cnum.as_u32(),
src_hash,
name_hash,
source_length,
self.cnum,
lines,
multibyte_chars,
non_narrow_chars,
normalized_pos,
start_pos,
end_pos,
);
debug!(
"CrateMetaData::imported_source_files alloc \
Expand Down
58 changes: 47 additions & 11 deletions src/librustc_metadata/rmeta/encoder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ use rustc_ast::ast;
use rustc_ast::attr;
use rustc_span::source_map::Spanned;
use rustc_span::symbol::{kw, sym, Ident, Symbol};
use rustc_span::{self, FileName, SourceFile, Span};
use rustc_span::{self, ExternalSource, FileName, SourceFile, Span};
use std::hash::Hash;
use std::num::NonZeroUsize;
use std::path::Path;
Expand Down Expand Up @@ -167,20 +167,56 @@ impl<'tcx> SpecializedEncoder<Span> for EncodeContext<'tcx> {
return TAG_INVALID_SPAN.encode(self);
}

// HACK(eddyb) there's no way to indicate which crate a Span is coming
// from right now, so decoding would fail to find the SourceFile if
// it's not local to the crate the Span is found in.
if self.source_file_cache.is_imported() {
return TAG_INVALID_SPAN.encode(self);
}
// There are two possible cases here:
// 1. This span comes from a 'foreign' crate - e.g. some crate upstream of the
// crate we are writing metadata for. When the metadata for *this* crate gets
// deserialized, the deserializer will need to know which crate it originally came
// from. We use `TAG_VALID_SPAN_FOREIGN` to indicate that a `CrateNum` should
// be deserialized after the rest of the span data, which tells the deserializer
// which crate contains the source map information.
// 2. This span comes from our own crate. No special hamdling is needed - we just
// write `TAG_VALID_SPAN_LOCAL` to let the deserializer know that it should use
// our own source map information.
let (tag, lo, hi) = if self.source_file_cache.is_imported() {
// To simplify deserialization, we 'rebase' this span onto the crate it originally came from
// (the crate that 'owns' the file it references. These rebased 'lo' and 'hi' values
// are relative to the source map information for the 'foreign' crate whose CrateNum
// we write into the metadata. This allows `imported_source_files` to binary
// search through the 'foreign' crate's source map information, using the
// deserialized 'lo' and 'hi' values directly.
//
// All of this logic ensures that the final result of deserialization is a 'normal'
// Span that can be used without any additional trouble.
let external_start_pos = {
// Introduce a new scope so that we drop the 'lock()' temporary
match &*self.source_file_cache.external_src.lock() {
ExternalSource::Foreign { original_start_pos, .. } => *original_start_pos,
src => panic!("Unexpected external source {:?}", src),
}
};
let lo = (span.lo - self.source_file_cache.start_pos) + external_start_pos;
let hi = (span.hi - self.source_file_cache.start_pos) + external_start_pos;

(TAG_VALID_SPAN_FOREIGN, lo, hi)
} else {
(TAG_VALID_SPAN_LOCAL, span.lo, span.hi)
};

TAG_VALID_SPAN.encode(self)?;
span.lo.encode(self)?;
tag.encode(self)?;
lo.encode(self)?;

// Encode length which is usually less than span.hi and profits more
// from the variable-length integer encoding that we use.
let len = span.hi - span.lo;
len.encode(self)
let len = hi - lo;
len.encode(self)?;

if tag == TAG_VALID_SPAN_FOREIGN {
// This needs to be two lines to avoid holding the `self.source_file_cache`
// while calling `cnum.encode(self)`
let cnum = self.source_file_cache.cnum;
cnum.encode(self)?;
}
Ok(())

// Don't encode the expansion context.
}
Expand Down
5 changes: 3 additions & 2 deletions src/librustc_metadata/rmeta/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -405,5 +405,6 @@ struct GeneratorData<'tcx> {
}

// Tags used for encoding Spans:
const TAG_VALID_SPAN: u8 = 0;
const TAG_INVALID_SPAN: u8 = 1;
const TAG_VALID_SPAN_LOCAL: u8 = 0;
const TAG_VALID_SPAN_FOREIGN: u8 = 1;
const TAG_INVALID_SPAN: u8 = 2;
Loading

0 comments on commit af55daf

Please sign in to comment.