LLM PR reviewer #6381

Merged: NachoEchevarria merged 31 commits into master from nacho/experimentalLlm on Jan 22, 2025

NachoEchevarria (Contributor) commented Dec 2, 2024

Summary of changes

This PR adds an optional pipeline stage that sends the PR's code changes to OpenAI for a code review and writes the result as a report. The stage runs only when the pipeline variable "generate_llm_report" is set to "true". The variable defaults to "false" unless we decide to generate these kinds of reports by default.

A report can also be generated locally by running the LLMPRReview task. For example:
tracer\build LLMPRReview -GITHUB_TOKEN <GH_TOKEN> -OPEN_AI_KEY <OPEN_AI_KEY> -PullRequestNumber <PRNumber>

This task, when run in local mode, will generate two files:

  • Changes.txt: a file containing the prompt sent to OpenAI
  • Results.txt: the LLM report provided by the OpenAI API call
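
As a rough illustration of what this stage does, here is a minimal sketch of building a chat-completions request from a prompt and sending it to OpenAI. It is not the code in this PR: the class and method names, the system message wording, and the two-message layout are assumptions made for illustration. Only the "gpt-4o" model name and the OpenAI key parameter come from the PR itself.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative and not taken from the PR.
public static class LlmReviewSketch
{
    // Public OpenAI chat completions endpoint.
    private const string Endpoint = "https://api.openai.com/v1/chat/completions";

    // Serializes the request body: a model name plus a list of chat messages.
    public static string BuildRequestJson(string model, string prompt)
    {
        var requestContent = new
        {
            model,
            messages = new[]
            {
                // Assumed wording; the PR's real prompt is not shown here.
                new { role = "system", content = "You are a code reviewer." },
                new { role = "user", content = prompt },
            },
        };
        return JsonSerializer.Serialize(requestContent);
    }

    // Posts the request and returns the text of the first choice.
    public static async Task<string> RequestReviewAsync(
        HttpClient client, string apiKey, string model, string prompt)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = new StringContent(
                BuildRequestJson(model, prompt), Encoding.UTF8, "application/json"),
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

        using var response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();

        using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
        return doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString() ?? string.Empty;
    }
}
```

In this sketch, the prompt passed to `BuildRequestJson` would play the role of Changes.txt, and the string returned by `RequestReviewAsync` the role of Results.txt.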

Reason for change

It's an innovation week project.

Implementation details

Test coverage

Other details

@github-actions github-actions bot added the area:builds project files, build scripts, pipelines, versioning, releases, packages label Dec 2, 2024
datadog-ddstaging bot commented Dec 2, 2024

Datadog Report

Branch report: nacho/experimentalLlm
Commit report: 820ed6d
Test service: dd-trace-dotnet

✅ 0 Failed, 237879 Passed, 1964 Skipped, 19h 38m 3.35s Total Time
❄️ 1 New Flaky

New Flaky Tests (1)

  • TestInstrumentedUnitTests - Datadog.Trace.Security.IntegrationTests.Iast.IastInstrumentationUnitTests - Last Failure

     Expected exit code: 0, actual exit code: 1.
    

andrewlock (Member) commented Dec 2, 2024

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program, and are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

  • Welch's t-test at a 5% significance level
  • Only results indicating a difference greater than 5% and 5 ms are considered.
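
One plausible reading of these thresholds is that a result is flagged only when all three conditions hold at once. The sketch below encodes that reading. The class and method names are hypothetical, the p-value is assumed to come from a Welch's t-test computed elsewhere, and the report's real implementation is not shown in this PR.

```csharp
using System;

// Hypothetical filter illustrating the thresholds listed above.
public static class ExecutionTimeThresholds
{
    public const double SignificanceLevel = 0.05; // Welch's t-test, 5%
    public const double MinRelativeDiff = 0.05;   // difference greater than 5%
    public const double MinAbsoluteDiffMs = 5.0;  // and greater than 5 ms

    // Returns true when the PR is slower than master by more than both
    // thresholds and the difference is statistically significant.
    public static bool IsWorse(double masterMeanMs, double prMeanMs, double pValue)
    {
        double diffMs = prMeanMs - masterMeanMs;
        double relative = diffMs / masterMeanMs;
        return pValue < SignificanceLevel
            && relative > MinRelativeDiff
            && diffMs > MinAbsoluteDiffMs;
    }
}
```

Under this reading, a run with means of 1,090 ms against 1,082 ms differs by 8 ms, which clears the 5 ms bar, but by only about 0.7%, so it would not be flagged whatever the p-value.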

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (69ms)  : 66, 73
     .   : milestone, 69,
    master - mean (69ms)  : 66, 72
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (986ms)  : 957, 1014
     .   : milestone, 986,
    master - mean (982ms)  : 957, 1007
     .   : milestone, 982,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (108ms)  : 106, 110
     .   : milestone, 108,
    master - mean (108ms)  : 105, 110
     .   : milestone, 108,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (684ms)  : 666, 701
     .   : milestone, 684,
    master - mean (681ms)  : 666, 696
     .   : milestone, 681,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (92ms)  : 89, 95
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (639ms)  : 625, 654
     .   : milestone, 639,
    master - mean (634ms)  : 616, 651
     .   : milestone, 634,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (189ms)  : 185, 194
     .   : milestone, 189,
    master - mean (189ms)  : 184, 193
     .   : milestone, 189,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (1,090ms)  : 1055, 1124
     .   : milestone, 1090,
    master - mean (1,082ms)  : 1051, 1114
     .   : milestone, 1082,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (276ms)  : 271, 282
     .   : milestone, 276,
    master - mean (276ms)  : 271, 281
     .   : milestone, 276,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (874ms)  : 848, 900
     .   : milestone, 874,
    master - mean (866ms)  : 835, 897
     .   : milestone, 866,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (265ms)  : 261, 269
     .   : milestone, 265,
    master - mean (263ms)  : 259, 266
     .   : milestone, 263,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (851ms)  : 819, 883
     .   : milestone, 851,
    master - mean (848ms)  : 810, 886
     .   : milestone, 848,


andrewlock (Member) commented Dec 2, 2024

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput for the PR is worse than latest master (a drop of 5% or more) are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (11.174M)   : 0, 11174235
    master (11.434M)   : 0, 11433823
    benchmarks/2.9.0 (11.033M)   : 0, 11032866

    section Automatic
    This PR (6381) (7.347M)   : 0, 7346518
    master (7.329M)   : 0, 7329326
    benchmarks/2.9.0 (7.786M)   : 0, 7785853

    section Trace stats
    master (7.611M)   : 0, 7611113

    section Manual
    master (11.108M)   : 0, 11107912

    section Manual + Automatic
    This PR (6381) (6.682M)   : 0, 6681760
    master (6.845M)   : 0, 6844945

    section DD_TRACE_ENABLED=0
    master (10.329M)   : 0, 10328733

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (9.275M)   : 0, 9274926
    master (9.534M)   : 0, 9533585
    benchmarks/2.9.0 (9.495M)   : 0, 9494821

    section Automatic
    This PR (6381) (6.415M)   : 0, 6414890
    master (6.293M)   : 0, 6293263

    section Trace stats
    master (6.541M)   : 0, 6541247

    section Manual
    master (9.502M)   : 0, 9502053

    section Manual + Automatic
    This PR (6381) (5.927M)   : 0, 5926822
    master (5.976M)   : 0, 5976365

    section DD_TRACE_ENABLED=0
    master (8.806M)   : 0, 8806055

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (9.855M)   : 0, 9854550
    master (9.968M)   : 0, 9968345
    benchmarks/2.9.0 (10.020M)   : 0, 10019592

    section Automatic
    This PR (6381) (6.280M)   : 0, 6280374
    master (6.506M)   : 0, 6506205
    benchmarks/2.9.0 (7.255M)   : 0, 7255257

    section Trace stats
    master (7.120M)   : 0, 7119839

    section Manual
    master (10.011M)   : 0, 10010780

    section Manual + Automatic
    This PR (6381) (5.945M)   : 0, 5944685
    master (5.923M)   : 0, 5922704

    section DD_TRACE_ENABLED=0
    master (9.290M)   : 0, 9290361


andrewlock (Member) commented Dec 2, 2024

Benchmarks Report for tracer 🐌

Benchmarks for #6381 compared to master:

  • 1 benchmark is slower, with geometric mean 1.156
  • All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

  • Mann–Whitney U test at a 5% significance level
  • Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartStopWithChild net6.0 8.17μs 46ns 312ns 0.0159 0.00397 0 5.61 KB
master StartStopWithChild netcoreapp3.1 10.2μs 56.5ns 348ns 0.0197 0.00986 0 5.8 KB
master StartStopWithChild net472 16μs 43ns 161ns 1.05 0.322 0.0967 6.22 KB
#6381 StartStopWithChild net6.0 7.95μs 45.4ns 343ns 0.0155 0.00775 0 5.61 KB
#6381 StartStopWithChild netcoreapp3.1 9.88μs 54.1ns 333ns 0.0142 0.00474 0 5.8 KB
#6381 StartStopWithChild net472 16μs 43.8ns 164ns 1.03 0.296 0.0881 6.21 KB
Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 526μs 1.46μs 5.65μs 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 652μs 189ns 653ns 0 0 0 2.7 KB
master WriteAndFlushEnrichedTraces net472 837μs 406ns 1.57μs 0.417 0 0 3.3 KB
#6381 WriteAndFlushEnrichedTraces net6.0 485μs 205ns 769ns 0 0 0 2.7 KB
#6381 WriteAndFlushEnrichedTraces netcoreapp3.1 664μs 537ns 2.08μs 0 0 0 2.7 KB
#6381 WriteAndFlushEnrichedTraces net472 858μs 454ns 1.7μs 0.428 0 0 3.3 KB
Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendRequest net6.0 133μs 506ns 1.96μs 0.186 0 0 14.47 KB
master SendRequest netcoreapp3.1 147μs 333ns 1.29μs 0.218 0 0 17.27 KB
master SendRequest net472 0.00622ns 0.00194ns 0.00751ns 0 0 0 0 b
#6381 SendRequest net6.0 126μs 362ns 1.4μs 0.195 0 0 14.47 KB
#6381 SendRequest netcoreapp3.1 143μs 518ns 2μs 0.218 0 0 17.27 KB
#6381 SendRequest net472 0.0124ns 0.00341ns 0.0132ns 0 0 0 0 b
Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master WriteAndFlushEnrichedTraces net6.0 550μs 2.07μs 7.76μs 0.534 0 0 41.66 KB
master WriteAndFlushEnrichedTraces netcoreapp3.1 683μs 3.81μs 29.7μs 0.322 0 0 41.73 KB
master WriteAndFlushEnrichedTraces net472 839μs 3.82μs 14.8μs 8.45 2.53 0.422 53.29 KB
#6381 WriteAndFlushEnrichedTraces net6.0 541μs 1.19μs 4.45μs 0.553 0 0 41.66 KB
#6381 WriteAndFlushEnrichedTraces netcoreapp3.1 668μs 3.06μs 12.2μs 0.321 0 0 41.71 KB
#6381 WriteAndFlushEnrichedTraces net472 829μs 4.06μs 16.8μs 8.22 2.47 0.411 53.26 KB
Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteNonQuery net6.0 1.32μs 0.563ns 2.03ns 0.014 0 0 1.02 KB
master ExecuteNonQuery netcoreapp3.1 1.77μs 1.87ns 7.23ns 0.0133 0 0 1.02 KB
master ExecuteNonQuery net472 2.07μs 2.57ns 9.95ns 0.157 0.00104 0 987 B
#6381 ExecuteNonQuery net6.0 1.27μs 0.898ns 3.36ns 0.0146 0 0 1.02 KB
#6381 ExecuteNonQuery netcoreapp3.1 1.73μs 1.39ns 5.37ns 0.0138 0 0 1.02 KB
#6381 ExecuteNonQuery net472 2.11μs 2.32ns 8.99ns 0.156 0.00105 0 987 B
Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master CallElasticsearch net6.0 1.18μs 0.872ns 3.38ns 0.0135 0 0 976 B
master CallElasticsearch netcoreapp3.1 1.5μs 0.619ns 2.32ns 0.0135 0 0 976 B
master CallElasticsearch net472 2.55μs 1.56ns 5.82ns 0.158 0 0 995 B
master CallElasticsearchAsync net6.0 1.27μs 0.712ns 2.66ns 0.0133 0 0 952 B
master CallElasticsearchAsync netcoreapp3.1 1.59μs 1.04ns 4.03ns 0.0144 0 0 1.02 KB
master CallElasticsearchAsync net472 2.59μs 1.31ns 4.9ns 0.167 0 0 1.05 KB
#6381 CallElasticsearch net6.0 1.17μs 1.02ns 3.67ns 0.0136 0 0 976 B
#6381 CallElasticsearch netcoreapp3.1 1.53μs 1.32ns 4.94ns 0.013 0 0 976 B
#6381 CallElasticsearch net472 2.55μs 1.91ns 7.4ns 0.157 0 0 995 B
#6381 CallElasticsearchAsync net6.0 1.36μs 0.805ns 3.01ns 0.013 0 0 952 B
#6381 CallElasticsearchAsync netcoreapp3.1 1.7μs 1.54ns 5.98ns 0.014 0 0 1.02 KB
#6381 CallElasticsearchAsync net472 2.56μs 1.11ns 4.14ns 0.166 0 0 1.05 KB
Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master ExecuteAsync net6.0 1.21μs 0.593ns 2.22ns 0.013 0 0 952 B
master ExecuteAsync netcoreapp3.1 1.77μs 1.17ns 4.4ns 0.0131 0 0 952 B
master ExecuteAsync net472 1.89μs 0.508ns 1.97ns 0.145 0 0 915 B
#6381 ExecuteAsync net6.0 1.24μs 0.497ns 1.79ns 0.013 0 0 952 B
#6381 ExecuteAsync netcoreapp3.1 1.6μs 1.26ns 4.73ns 0.0129 0 0 952 B
#6381 ExecuteAsync net472 1.81μs 0.633ns 2.28ns 0.145 0 0 915 B
Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendAsync net6.0 4.52μs 1.39ns 5.21ns 0.0315 0 0 2.31 KB
master SendAsync netcoreapp3.1 5.3μs 4.79ns 18.5ns 0.0368 0 0 2.85 KB
master SendAsync net472 7.4μs 1.43ns 5.35ns 0.495 0 0 3.12 KB
#6381 SendAsync net6.0 4.57μs 1.09ns 3.95ns 0.032 0 0 2.31 KB
#6381 SendAsync netcoreapp3.1 5.21μs 2.41ns 9.32ns 0.0366 0 0 2.85 KB
#6381 SendAsync net472 7.33μs 1.53ns 5.74ns 0.495 0 0 3.12 KB
Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 1.45μs 0.98ns 3.67ns 0.0231 0 0 1.64 KB
master EnrichedLog netcoreapp3.1 2.12μs 1.62ns 6.07ns 0.0222 0 0 1.64 KB
master EnrichedLog net472 2.74μs 1.5ns 5.8ns 0.25 0 0 1.57 KB
#6381 EnrichedLog net6.0 1.45μs 1.03ns 3.99ns 0.0231 0 0 1.64 KB
#6381 EnrichedLog netcoreapp3.1 2.27μs 0.891ns 3.33ns 0.0217 0 0 1.64 KB
#6381 EnrichedLog net472 2.6μs 3.94ns 14.8ns 0.249 0 0 1.57 KB
Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 114μs 189ns 732ns 0.0569 0 0 4.28 KB
master EnrichedLog netcoreapp3.1 119μs 206ns 798ns 0 0 0 4.28 KB
master EnrichedLog net472 149μs 72.5ns 271ns 0.668 0.223 0 4.46 KB
#6381 EnrichedLog net6.0 116μs 129ns 499ns 0.0581 0 0 4.28 KB
#6381 EnrichedLog netcoreapp3.1 120μs 203ns 788ns 0.06 0 0 4.28 KB
#6381 EnrichedLog net472 151μs 214ns 827ns 0.683 0.228 0 4.46 KB
Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 3.01μs 1.07ns 4.13ns 0.0305 0 0 2.2 KB
master EnrichedLog netcoreapp3.1 4.23μs 1.89ns 7.32ns 0.0296 0 0 2.2 KB
master EnrichedLog net472 4.84μs 0.408ns 1.47ns 0.32 0 0 2.02 KB
#6381 EnrichedLog net6.0 2.95μs 0.817ns 3.17ns 0.031 0 0 2.2 KB
#6381 EnrichedLog netcoreapp3.1 4.13μs 1.82ns 7.06ns 0.0288 0 0 2.2 KB
#6381 EnrichedLog net472 4.88μs 1.08ns 4.05ns 0.319 0 0 2.02 KB
Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master SendReceive net6.0 1.33μs 1.18ns 4.41ns 0.0159 0 0 1.14 KB
master SendReceive netcoreapp3.1 1.75μs 5.92ns 22.9ns 0.0158 0 0 1.14 KB
master SendReceive net472 2.12μs 0.722ns 2.8ns 0.183 0 0 1.16 KB
#6381 SendReceive net6.0 1.36μs 1.27ns 4.92ns 0.0162 0 0 1.14 KB
#6381 SendReceive netcoreapp3.1 1.75μs 5.58ns 21.6ns 0.0155 0 0 1.14 KB
#6381 SendReceive net472 2.02μs 3.03ns 11.3ns 0.183 0 0 1.16 KB
Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master EnrichedLog net6.0 2.71μs 0.648ns 2.34ns 0.0216 0 0 1.6 KB
master EnrichedLog netcoreapp3.1 3.85μs 1.65ns 5.94ns 0.0214 0 0 1.65 KB
master EnrichedLog net472 4.43μs 3.58ns 13.8ns 0.324 0 0 2.04 KB
#6381 EnrichedLog net6.0 2.73μs 1.03ns 3.98ns 0.0216 0 0 1.6 KB
#6381 EnrichedLog netcoreapp3.1 3.76μs 2.16ns 8.38ns 0.0226 0 0 1.65 KB
#6381 EnrichedLog net472 4.52μs 3.16ns 12.2ns 0.322 0 0 2.04 KB
Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6381

Benchmark diff/base Base Median (ns) Diff Median (ns) Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0 1.156 400.90 463.55
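
The diff/base figure above is the ratio of the two medians, and with a single slower benchmark the geometric mean of the ratios is presumably that same value. A minimal sketch of the arithmetic follows. The class and method names are hypothetical and this is not the benchmark tooling's own code.

```csharp
using System;
using System.Linq;

// Hypothetical helpers illustrating the diff/base arithmetic.
public static class BenchmarkRatios
{
    // diff/base ratio of medians; values above 1 mean the PR is slower.
    public static double Ratio(double baseMedianNs, double diffMedianNs) =>
        diffMedianNs / baseMedianNs;

    // Geometric mean of the ratios, computed through logarithms.
    public static double GeometricMean(params double[] ratios) =>
        Math.Exp(ratios.Select(r => Math.Log(r)).Average());
}
```

Here `BenchmarkRatios.Ratio(400.90, 463.55)` is about 1.156, which exceeds the 10% threshold, and the 62.65 ns gap between the medians is far above the 0.3 ns threshold.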

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master StartFinishSpan net6.0 401ns 0.496ns 1.92ns 0.00815 0 0 576 B
master StartFinishSpan netcoreapp3.1 568ns 1.8ns 6.98ns 0.00775 0 0 576 B
master StartFinishSpan net472 604ns 0.974ns 3.77ns 0.0918 0 0 578 B
master StartFinishScope net6.0 477ns 0.702ns 2.72ns 0.00967 0 0 696 B
master StartFinishScope netcoreapp3.1 665ns 1.03ns 3.97ns 0.00935 0 0 696 B
master StartFinishScope net472 874ns 1.41ns 5.47ns 0.104 0 0 658 B
#6381 StartFinishSpan net6.0 463ns 0.701ns 2.71ns 0.00804 0 0 576 B
#6381 StartFinishSpan netcoreapp3.1 623ns 1.28ns 4.95ns 0.00766 0 0 576 B
#6381 StartFinishSpan net472 622ns 0.997ns 3.86ns 0.0916 0 0 578 B
#6381 StartFinishScope net6.0 484ns 0.846ns 3.28ns 0.00973 0 0 696 B
#6381 StartFinishScope netcoreapp3.1 717ns 1.64ns 6.14ns 0.00932 0 0 696 B
#6381 StartFinishScope net472 853ns 1.46ns 5.67ns 0.104 0 0 658 B
Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch Method Toolchain Mean StdError StdDev Gen 0 Gen 1 Gen 2 Allocated
master RunOnMethodBegin net6.0 646ns 0.639ns 2.48ns 0.00962 0 0 696 B
master RunOnMethodBegin netcoreapp3.1 905ns 1.14ns 4.43ns 0.00912 0 0 696 B
master RunOnMethodBegin net472 1.11μs 2.23ns 8.64ns 0.104 0 0 658 B
#6381 RunOnMethodBegin net6.0 623ns 0.734ns 2.84ns 0.0096 0 0 696 B
#6381 RunOnMethodBegin netcoreapp3.1 950ns 1.18ns 4.58ns 0.00907 0 0 696 B
#6381 RunOnMethodBegin net472 1.05μs 1.87ns 7.25ns 0.104 0 0 658 B

@DataDog DataDog deleted a comment from github-actions bot Dec 12, 2024
@NachoEchevarria NachoEchevarria marked this pull request as ready for review December 12, 2024 16:47
@NachoEchevarria NachoEchevarria requested a review from a team as a code owner December 12, 2024 16:47
@NachoEchevarria NachoEchevarria changed the title from "Nacho/experimental llm" to "LLM PR reviewer" Dec 12, 2024
andrewlock (Member) left a comment

LGTM, just some nits

@@ -254,6 +254,39 @@ stages:
displayName: Generate Matrices
name: generate_variables_step

- stage: generate_LLM_Report
Member

nit: could we please put this at the bottom of the file, as it's entirely optional? 🥺

Contributor Author

Done!

Comment on lines 264 to 288
- template: steps/update-github-status-jobs.yml
  parameters:
    jobs: [generate_LLM_job]

- job: generate_LLM_job
  timeoutInMinutes: 3
  dependsOn: []
  pool:
    name: azure-windows-scale-set-3

  steps:
  - template: steps/clone-repo.yml
    parameters:
      targetShaId: $(targetShaId)
      targetBranch: $(targetBranch)
  - template: steps/install-latest-dotnet-sdk.yml

  - powershell: |
      tracer/build.ps1 LLMReport
    displayName: Generate LLM report
    name: generate_llm_step
    env:
      PullRequestNumber: $(System.PullRequest.PullRequestNumber)
      GITHUB_TOKEN: $(GITHUB_TOKEN)
      OpenAIKey: $(OPEN_AI_KEY)
Member

This is the wrong value I think, should be OPEN_AI_KEY based on the Nuke code?

Contributor Author

It's defined as
[Parameter("An OpenAI key", Name = "OPEN_AI_KEY")]
readonly string OpenAIKey;

It seems to be working...

Comment on lines 151 to 154
else if (string.IsNullOrEmpty(OpenAIKey))
{
    result = "Null or empty OpenAI key.";
}
Member

This won't happen because you marked it Required()

Contributor Author

Thanks! Done!


if (executeLocal)
{
    File.WriteAllText("changes.txt", fullPrompt);
Member

I think you should make these absolute paths, so that the location of changes.txt is fixed and in a sensible (i.e. temporary) place. That way it will be excluded by the gitignore (the same goes for LLMResult.txt).

Personally, I'd rather we didn't generate these files, and instead just print them to the console. You can always pipe them to a file if you want to anyway.
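
Both suggestions can be sketched together. The example below is a hypothetical helper, not the PR's code: it always prints to the console and only writes a file, at an absolute path, when asked to. Using the OS temp directory is an assumption here; the repository may have its own ignored temp folder that would be a better fit.

```csharp
using System;
using System.IO;

// Hypothetical helper; names and the temp location are illustrative.
public static class LocalReportOutput
{
    // Builds an absolute path under the OS temp directory.
    public static string GetTempReportPath(string fileName) =>
        Path.Combine(Path.GetTempPath(), fileName);

    // Prints the text to the console, and writes it to a temp file only on request.
    public static void Emit(string fileName, string text, bool alsoWriteFile)
    {
        Console.WriteLine(text);
        if (alsoWriteFile)
        {
            File.WriteAllText(GetTempReportPath(fileName), text);
        }
    }
}
```

With console output as the default, a caller who wants a file can still redirect the output when invoking the build.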

Contributor Author

Agree. Thanks! Done!

@@ -94,6 +99,92 @@ await client.Issue.Update(
Console.WriteLine($"PR assigned");
});

Target LLMReport => _ => _
Member

Maybe we should rename it so that it's clear it's only a review of a PR? 🤔

Contributor Author

I have renamed the target to LLMPRReview. Thanks!

@DataDog DataDog deleted a comment from andrewlock Jan 20, 2025
e-n-0 (Member) left a comment

Nice! I'll try it in some of my PRs 😄


var requestContent = new
{
    model = "gpt-4o",
e-n-0 (Member) commented Jan 20, 2025

Is the result really tied to the model used? (For example, does it make any difference with o1 and its thinking pattern?)
In the future, we could maybe specify the OpenAI model as an argument, in case new models are released.

Contributor Author

Yes, I chose this model as a compromise between results and price, but adding the model as an optional argument could be a nice feature. Thanks!
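
A minimal sketch of that optional argument, assuming a hypothetical model parameter that may be left unset: the resolver falls back to the model currently hard-coded in the PR. The class and method names are illustrative only.

```csharp
using System;

// Hypothetical resolver for an optional model argument.
public static class ModelSelection
{
    // The model currently hard-coded in the PR.
    public const string DefaultModel = "gpt-4o";

    // Returns the requested model, or the default when none was supplied.
    public static string Resolve(string? requestedModel) =>
        string.IsNullOrWhiteSpace(requestedModel) ? DefaultModel : requestedModel.Trim();
}
```

The resolved value would then replace the literal in the request body, so trying another model needs no code change.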

NachoEchevarria (Contributor Author) commented

Thanks for your feedback and reviews!

@NachoEchevarria NachoEchevarria merged commit fe5fbb5 into master Jan 22, 2025
125 of 127 checks passed
@NachoEchevarria NachoEchevarria deleted the nacho/experimentalLlm branch January 22, 2025 17:21
@github-actions github-actions bot added this to the vNext-v3 milestone Jan 22, 2025