Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang] replaced the usage of asctime with std::put_time #99075

Merged
merged 1 commit into from
Jul 24, 2024

Conversation

yichi170
Copy link
Contributor

@yichi170 yichi170 commented Jul 16, 2024

In clang/lib/Lex/PPMacroExpansion.cpp, replaced the usage of the obsolete asctime function with std::put_time for generating timestamp strings.

Fixes: #98724

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jul 16, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 16, 2024

@llvm/pr-subscribers-clang

Author: Yi-Chi Lee (yichi170)

Changes

In clang/lib/Lex/PPMacroExpansion.cpp, replaced the usage of the obsolete asctime function with strftime for generating timestamp strings.

This is my first time contributing to an open-source project. If there is anything that I can improve (such as code quality), please let me know!

Fixes: #98724


Full diff: https://github.com/llvm/llvm-project/pull/99075.diff

1 Files Affected:

  • (modified) clang/lib/Lex/PPMacroExpansion.cpp (+7-5)
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 3913ff08c2eb5..f7e657be87932 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -1722,10 +1722,12 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
     // MSVC, ICC, GCC, VisualAge C++ extension.  The generated string should be
     // of the form "Ddd Mmm dd hh::mm::ss yyyy", which is returned by asctime.
     const char *Result;
+    char TimeString[std::size("Ddd Mmm dd hh:mm:ss yyyy")];
     if (getPreprocessorOpts().SourceDateEpoch) {
       time_t TT = *getPreprocessorOpts().SourceDateEpoch;
       std::tm *TM = std::gmtime(&TT);
-      Result = asctime(TM);
+      strftime(TimeString, std::size(TimeString), "%c", TM);
+      Result = TimeString;
     } else {
       // Get the file that we are lexing out of.  If we're currently lexing from
       // a macro, dig into the include stack.
@@ -1735,13 +1737,13 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
       if (CurFile) {
         time_t TT = CurFile->getModificationTime();
         struct tm *TM = localtime(&TT);
-        Result = asctime(TM);
+        strftime(TimeString, std::size(TimeString), "%c", TM);
+        Result = TimeString;
       } else {
-        Result = "??? ??? ?? ??:??:?? ????\n";
+        Result = "??? ??? ?? ??:??:?? ????";
       }
     }
-    // Surround the string with " and strip the trailing newline.
-    OS << '"' << StringRef(Result).drop_back() << '"';
+    OS << '"' << Result << '"';
     Tok.setKind(tok::string_literal);
   } else if (II == Ident__FLT_EVAL_METHOD__) {
     // __FLT_EVAL_METHOD__ is set to the default value.

@shafik shafik requested review from AaronBallman and MaskRay July 16, 2024 21:01
if (getPreprocessorOpts().SourceDateEpoch) {
time_t TT = *getPreprocessorOpts().SourceDateEpoch;
std::tm *TM = std::gmtime(&TT);
Result = asctime(TM);
strftime(TimeString, std::size(TimeString), "%c", TM);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we need to check the result of strftime is not 0 which would indicate that it ran out of space and we need to recover.

Copy link
Contributor Author

@yichi170 yichi170 Jul 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Do you mean if the result of strftime is zero, Result have to be set as "???...?"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's three problems here:

  1. We need to look at the return value from strftime() and if it's 0, fallback to the failure mode.
  2. Different locales may require a different buffer size.
  3. %c isn't specified to return the same thing as asctime() though it frequently will.

So I think TimeString should probably have a hard-coded buffer size that's a bit bigger than we expect is necessary, like char TimeString[64];, and the format string to strftime() should be %a %b %d %T %Y, and we should check the return value of the call to strftime() and fall back to ?? .. ? if the value is 0.

@cor3ntin
Copy link
Contributor

cor3ntin commented Jul 17, 2024

asctime is deprecated. But asctime_s is not. See https://en.cppreference.com/w/c/chrono/asctime

using strftime is going to be challenging because of locale (ie we would have to temporarily switch to a C locale - both because we want a locale independent behavior, and because for an arbitrary locale the buffer might need to be bigger than for the C locale.

(an alternative would be to use the C++ interface https://compiler-explorer.com/z/WEqTW341b )

Either way, we need to make sure sure the produced string are identical, I'm sure people rely on that.

Edit: asctime_s is defined in annex K so we can't use it as this is poorly supported

@yichi170
Copy link
Contributor Author

I submitted a commit that used the original way with error handling and fixed-sized TimeString.

(an alternative would be to use the C++ interface https://compiler-explorer.com/z/WEqTW341b )

Since I'm not familiar with the issue related to locale, I'm not sure which method is better. If the C++ interface implementation mentioned by @cor3ntin is preferable, I can modify the current implementation.

if (getPreprocessorOpts().SourceDateEpoch) {
time_t TT = *getPreprocessorOpts().SourceDateEpoch;
std::tm *TM = std::gmtime(&TT);
Result = asctime(TM);
if (strftime(TimeString, sizeof TimeString, "%a %b %e %T %Y", TM) != 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronBallman I adapted your advice and here I use %e instead of %d since asctime returns the space-padded Day of the month.

clang/lib/Lex/PPMacroExpansion.cpp Show resolved Hide resolved
if (getPreprocessorOpts().SourceDateEpoch) {
time_t TT = *getPreprocessorOpts().SourceDateEpoch;
std::tm *TM = std::gmtime(&TT);
Result = asctime(TM);
if (strftime(TimeString, sizeof TimeString, "%a %b %e %T %Y", TM) != 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so the last remaining problem is that of locale. I didn't realize that asctime() is locale independent, which means that switching to strftime() will change the output for some users. :-(

We could use setlocale() before the call to strftime() and then reset the locale after the call, but another approach would be to refactor ComputeDATE_TIME which does all the formatting manually and in a locale-independent way. Given what a pain locales are in practice, I sort of think the refactoring is actually a better approach. WDYT @cor3ntin?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why i suggested the C++ approach here #99075 (comment)

(wouldn't it be nice if the C standard provided overloads of everything that takes explicit local parameters to avoid having to do these sort of dances!)

Copy link
Contributor Author

@yichi170 yichi170 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that using %b in put_time is also locale-dependent. Is it expected?
ref: https://en.cppreference.com/w/cpp/io/manip/put_time

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SHOOT, yes, that is expected, same for %a.

This is why i suggested the C++ approach here #99075 (comment)

We can't use that approach because we need to compile in C++17 mode: https://compiler-explorer.com/z/xosc68WMn

Okay, so how's this for horrible:

bool Failed = false;
if (const char *OldLocale = ::setlocale(LC_TIME, "C")) {
  if (strftime(...)) {
  } else {
    Failed = true;
  }
  ::setlocale(LC_TIME, OldLocale);
} else {
  Failed = true;
}

if (Failed) {
  // Emit question marks instead
}

Copy link
Contributor

@cor3ntin cor3ntin Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronBallman the std::print thing was illustrative, this works in C++17 (well, 11)

    std::stringstream buf;
    // this is the line that ensure we are not affected by locales 
    buf.imbue(std::locale("C"));
    std::time_t t = std::time(nullptr);
    std::tm tm = *std::localtime(&t);
    buf << std::put_time(&tm, "%c");
    std::string str = buf.str(); // do whatever with that string

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, okay, thanks! Hmm, if we have to mess with locales either way, this seems reasonable to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

asctime_r might be the best replacement, but it seems unavailable on MSVC.
MSVC has asctime_s.

@yichi170
Copy link
Contributor Author

I implemented with std::put_time instead, and setstate for further checking whether std::put_time is successful or not.

std::string Result = "??? ??? ?? ??:??:?? ????";
std::stringstream TmpStream;
TmpStream.imbue(std::locale("C"));
TmpStream.setstate(std::ios_base::badbit);
Copy link
Contributor

@cor3ntin cor3ntin Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not needed afaik (the bad bit would be set automatically on failure) ... oh wait are you trying to check that one of the two code path was taken? I think it would be cleaner to either use a boolean for that, or materialize TmpStream.str(); and check whether it's empty (that way you can get rid of the setstate/clear calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed it. Thanks for the advice.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, the changes LGTM assuming precommit CI doesn't discover anything. @cor3ntin are you happy with the changes as well?

@yichi170 yichi170 changed the title [clang] replaced the usage of asctime with strftime [clang] replaced the usage of asctime with std::put_time Jul 24, 2024
@yichi170
Copy link
Contributor Author

I don't have write access, so if the PR is ready, please help me land it!
If there is anything that I need to do, please let me know!

@yichi170 yichi170 force-pushed the asctime-to-strftime branch from 5a8c840 to 60a8a36 Compare July 24, 2024 03:16
@cor3ntin
Copy link
Contributor

LGTM too!
Thanks for the fix :)

@cor3ntin cor3ntin merged commit fad17b4 into llvm:main Jul 24, 2024
6 checks passed
Copy link

@yichi170 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested
by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as
the builds can include changes from many authors. It is not uncommon for your
change to be included in a build that fails due to someone else's changes, or
infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself.
This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@rorth
Copy link
Collaborator

rorth commented Jul 24, 2024

This seems to have broken the Solaris/amd64 and Solaris/sparcv9 buildbots.

@rorth
Copy link
Collaborator

rorth commented Jul 25, 2024

Confirmed: reverting the change locally restores the builds, although I don't yet see why.

@AaronBallman
Copy link
Collaborator

Confirmed: reverting the change locally restores the builds, although I don't yet see why.

The fact that there's a missing symbol suggests the STL on the machine is not conforming... the symbol that's missing (_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPKSt2tmPKcSB_) demangles to: std::time_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, std::tm const*, char const*, char const*) const which has been around since C++98.

@AaronBallman
Copy link
Collaborator

Aha!

https://bugs.llvm.org/show_bug.cgi?id=33767

We have a mixture of both std::tm and ::tm in here, try switching to using ::tm and see if that helps.

@rorth
Copy link
Collaborator

rorth commented Jul 25, 2024

Aha!

https://bugs.llvm.org/show_bug.cgi?id=33767

Ah, old sins (from the GCC side no less) keep haunting me.

We have a mixture of both std::tm and ::tm in here, try switching to using ::tm and see if that helps.

Unfortunately not: I always get the undefined reference to the std::tm const * version, no matter what I tried.

@AaronBallman
Copy link
Collaborator

We have a mixture of both std::tm and ::tm in here, try switching to using ::tm and see if that helps.

Unfortunately not: I always get the undefined reference to the std::tm const * version, no matter what I tried.

Drat!

If you dump the symbols from the STL shared library on the system, is there one for time_put at all? If so, what is the mangled symbol it exports?

(As a perhaps terrible idea, could we use the alias attribute on a redeclaration to try to force to link against the correct mangling for Solaris only?)

@rorth
Copy link
Collaborator

rorth commented Jul 25, 2024

If you dump the symbols from the STL shared library on the system, is there one for time_put at all? If so, what is the mangled symbol it exports?

While the source requires

_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPKSt2tmPKcSB_

std::time_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, std::tm const*, char const*, char const*) const

libstdc++.so.6 only provides

_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPK2tmPKcSB_

std::time_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::put(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, tm const*, char const*, char const*) const

i.e. tm const * instead of std::tm const *.

(As a perhaps terrible idea, could we use the alias attribute on a redeclaration to try to force to link against the correct mangling for Solaris only?)

A hack along the lines of

asm("_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPKSt2tmPKcSB_ = _ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPK2tmPKcSB_");

at least allows clang to link.

@AaronBallman
Copy link
Collaborator

One last thing to try -- what happens if you replace ::tm and std::tm with struct tm? Does that happen to get you the correct symbol (hoping the elaborated type name makes a difference)?

@rorth
Copy link
Collaborator

rorth commented Jul 25, 2024

Unfortunately not.

@AaronBallman
Copy link
Collaborator

Blech, thank you for trying! I guess that unless there's some way to hide this ugliness in the cmake scripts or someone else has better ideas to try, the asm aliasing (on Solaris only) may be our best path forward.

@rorth
Copy link
Collaborator

rorth commented Jul 25, 2024

At least in the short term. ISTM that clang++ may need to do something along the lines of Bug 33767 (resp. the underlying GCC bug/patch), otherwise users are in for a nasty surprise.

@AaronBallman
Copy link
Collaborator

At least in the short term. ISTM that clang++ may need to do something along the lines of Bug 33767 (resp. the underlying GCC bug/patch), otherwise users are in for a nasty surprise.

Agreed; long-term we need a better solution. Hopefully someone maintaining Solaris support can step in and help there, because this is quite surprising behavior.

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
In `clang/lib/Lex/PPMacroExpansion.cpp`, replaced the usage of the
obsolete `asctime` function with `std::put_time` for generating
timestamp strings.

Fixes: #98724

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250729
@yichi170
Copy link
Contributor Author

Thanks for identifying the problem. @AaronBallman @rorth

@rorth
Copy link
Collaborator

rorth commented Jul 26, 2024

At least in the short term. ISTM that clang++ may need to do something along the lines of Bug 33767 (resp. the underlying GCC bug/patch), otherwise users are in for a nasty surprise.

Agreed; long-term we need a better solution. Hopefully someone maintaining Solaris support can step in and help there, because this is quite surprising behavior.

The cleaned-up hack is in the final rounds of testing. Wlll create a PR once that's done.

There's no real Solaris maintainer AFAIK: while I do run the buildbots and try to keep the port in shape as time allows, I can only do so much. The Solaris GCC maintainership takes enough of my time and splitting it would only be to the detriment of both projects.

TBH, I wouldn't even know where to look: I've mostly dealt with compiler-rt and driver issues so far.

rorth added a commit to rorth/llvm-project that referenced this pull request Jul 26, 2024
The introduction of `std::put_time` in
fad17b4 broke the Solaris build:
```
Undefined			first referenced
 symbol  			    in file
_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPKSt2tmPKcSB_ lib/libclangLex.a(PPMacroExpansion.cpp.o)
ld: fatal: symbol referencing errors
```
As discussed in PR llvm#99075, the problem is that GCC mangles `std::tm` as
`tm` on Solaris for binary compatility, while `clang` doesn't (Issue

As a stop-gap measure, this patch introduces an `asm` level alias to the
same effect.

Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
rorth added a commit that referenced this pull request Jul 26, 2024
The introduction of `std::put_time` in
fad17b4 broke the Solaris build:
```
Undefined			first referenced
 symbol  			    in file
_ZNKSt8time_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEEE3putES3_RSt8ios_basecPKSt2tmPKcSB_ lib/libclangLex.a(PPMacroExpansion.cpp.o)
ld: fatal: symbol referencing errors
```
As discussed in PR #99075, the problem is that GCC mangles `std::tm` as
`tm` on Solaris for binary compatility, while `clang` doesn't (Issue
#33114).

As a stop-gap measure, this patch introduces an `asm` level alias to the
same effect.

Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and
`x86_64-pc-linux-gnu`.
Harini0924 pushed a commit to Harini0924/llvm-project that referenced this pull request Aug 1, 2024
)

In `clang/lib/Lex/PPMacroExpansion.cpp`, replaced the usage of the
obsolete `asctime` function with `std::put_time` for generating
timestamp strings.

Fixes: llvm#98724
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clang/lib/Lex/PPMacroExpansion.cpp: 2 * calls to obsolete function asctime
7 participants