LLM PR reviewer #6381

NachoEchevarria · 2024-12-02T11:19:12Z

Summary of changes

This PR adds one stage to the pipeline that writes a report with the code changes of the PR by sending them to OpenAI for a code review. This stage is optional and is only launched when the pipeline variable "generate_llm_report" is set to "true" here. By default, this variable will be set as "false" unless it's decided that we want to add these kind of reports by default.

Also, a report can be generated locally by running the task LLMPRReview
For example:
tracer\build LLMPRReview -GITHUB_TOKEN <GH_TOKEN> -OPEN_AI_KEY <OPEN_AI_KEY> -PullRequestNumber <PRNumber>

This task, when run in local mode, will generate two files:

Changes.txt: a file containing the prompt sent to OpenAI
Results.txt: the LLM report provided by the OpenAI API call

Reason for change

It's an innovation week project.

Implementation details

Test coverage

Other details

datadog-ddstaging · 2024-12-02T12:01:58Z

Datadog Report

Branch report: nacho/experimentalLlm
Commit report: 820ed6d
Test service: dd-trace-dotnet

✅ 0 Failed, 237879 Passed, 1964 Skipped, 19h 38m 3.35s Total Time
❄️ 1 New Flaky

New Flaky Tests (1)

TestInstrumentedUnitTests - Datadog.Trace.Security.IntegrationTests.Iast.IastInstrumentationUnitTests - Last Failure
Expand for error
```
 Expected exit code: 0, actual exit code: 1.
```

andrewlock · 2024-12-02T12:12:54Z

Execution-Time Benchmarks Report ⏱️

Execution-time results for samples comparing the following branches/commits:

Execution-time benchmarks measure the whole time it takes to execute a program. And are intended to measure the one-off costs. Cases where the execution time results for the PR are worse than latest master results are shown in red. The following thresholds were used for comparing the execution times:

Welch test with statistical test for significance of 5%
Only results indicating a difference greater than 5% and 5 ms are considered.

Note that these results are based on a single point-in-time result for each branch. For full results, see the dashboard.

Graphs show the p99 interval based on the mean and StdDev of the test run, as well as the mean value of the run (shown as a diamond below the graph).

gantt
    title Execution time (ms) FakeDbCommand (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (69ms)  : 66, 73
     .   : milestone, 69,
    master - mean (69ms)  : 66, 72
     .   : milestone, 69,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (986ms)  : 957, 1014
     .   : milestone, 986,
    master - mean (982ms)  : 957, 1007
     .   : milestone, 982,

gantt
    title Execution time (ms) FakeDbCommand (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (108ms)  : 106, 110
     .   : milestone, 108,
    master - mean (108ms)  : 105, 110
     .   : milestone, 108,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (684ms)  : 666, 701
     .   : milestone, 684,
    master - mean (681ms)  : 666, 696
     .   : milestone, 681,

gantt
    title Execution time (ms) FakeDbCommand (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (92ms)  : 89, 95
     .   : milestone, 92,
    master - mean (92ms)  : 90, 94
     .   : milestone, 92,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (639ms)  : 625, 654
     .   : milestone, 639,
    master - mean (634ms)  : 616, 651
     .   : milestone, 634,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Framework 4.6.2) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (189ms)  : 185, 194
     .   : milestone, 189,
    master - mean (189ms)  : 184, 193
     .   : milestone, 189,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (1,090ms)  : 1055, 1124
     .   : milestone, 1090,
    master - mean (1,082ms)  : 1051, 1114
     .   : milestone, 1082,

gantt
    title Execution time (ms) HttpMessageHandler (.NET Core 3.1) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (276ms)  : 271, 282
     .   : milestone, 276,
    master - mean (276ms)  : 271, 281
     .   : milestone, 276,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (874ms)  : 848, 900
     .   : milestone, 874,
    master - mean (866ms)  : 835, 897
     .   : milestone, 866,

gantt
    title Execution time (ms) HttpMessageHandler (.NET 6) 
    dateFormat  X
    axisFormat %s
    todayMarker off
    section Baseline
    This PR (6381) - mean (265ms)  : 261, 269
     .   : milestone, 265,
    master - mean (263ms)  : 259, 266
     .   : milestone, 263,

    section CallTarget+Inlining+NGEN
    This PR (6381) - mean (851ms)  : 819, 883
     .   : milestone, 851,
    master - mean (848ms)  : 810, 886
     .   : milestone, 848,

andrewlock · 2024-12-02T12:37:46Z

Throughput/Crank Report ⚡

Throughput results for AspNetCoreSimpleController comparing the following branches/commits:

Cases where throughput results for the PR are worse than latest master (5% drop or greater), results are shown in red.

Note that these results are based on a single point-in-time result for each branch. For full results, see one of the many, many dashboards!

gantt
    title Throughput Linux x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (11.174M)   : 0, 11174235
    master (11.434M)   : 0, 11433823
    benchmarks/2.9.0 (11.033M)   : 0, 11032866

    section Automatic
    This PR (6381) (7.347M)   : 0, 7346518
    master (7.329M)   : 0, 7329326
    benchmarks/2.9.0 (7.786M)   : 0, 7785853

    section Trace stats
    master (7.611M)   : 0, 7611113

    section Manual
    master (11.108M)   : 0, 11107912

    section Manual + Automatic
    This PR (6381) (6.682M)   : 0, 6681760
    master (6.845M)   : 0, 6844945

    section DD_TRACE_ENABLED=0
    master (10.329M)   : 0, 10328733

gantt
    title Throughput Linux arm64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (9.275M)   : 0, 9274926
    master (9.534M)   : 0, 9533585
    benchmarks/2.9.0 (9.495M)   : 0, 9494821

    section Automatic
    This PR (6381) (6.415M)   : 0, 6414890
    master (6.293M)   : 0, 6293263

    section Trace stats
    master (6.541M)   : 0, 6541247

    section Manual
    master (9.502M)   : 0, 9502053

    section Manual + Automatic
    This PR (6381) (5.927M)   : 0, 5926822
    master (5.976M)   : 0, 5976365

    section DD_TRACE_ENABLED=0
    master (8.806M)   : 0, 8806055

gantt
    title Throughput Windows x64 (Total requests) 
    dateFormat  X
    axisFormat %s
    section Baseline
    This PR (6381) (9.855M)   : 0, 9854550
    master (9.968M)   : 0, 9968345
    benchmarks/2.9.0 (10.020M)   : 0, 10019592

    section Automatic
    This PR (6381) (6.280M)   : 0, 6280374
    master (6.506M)   : 0, 6506205
    benchmarks/2.9.0 (7.255M)   : 0, 7255257

    section Trace stats
    master (7.120M)   : 0, 7119839

    section Manual
    master (10.011M)   : 0, 10010780

    section Manual + Automatic
    This PR (6381) (5.945M)   : 0, 5944685
    master (5.923M)   : 0, 5922704

    section DD_TRACE_ENABLED=0
    master (9.290M)   : 0, 9290361

andrewlock · 2024-12-02T12:58:32Z

Benchmarks Report for tracer 🐌

Benchmarks for #6381 compared to master:

1 benchmarks are slower, with geometric mean 1.156
All benchmarks have the same allocations

The following thresholds were used for comparing the benchmark speeds:

Mann–Whitney U test with statistical test for significance of 5%
Only results indicating a difference greater than 10% and 0.3 ns are considered.

Allocation changes below 0.5% are ignored.

Benchmark details

Benchmarks.Trace.ActivityBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Gen 2	Allocated
master	`StartStopWithChild`	net6.0	8.17μs	46ns	312ns	0.0159	0.00397	0	5.61 KB
master	`StartStopWithChild`	netcoreapp3.1	10.2μs	56.5ns	348ns	0.0197	0.00986	0	5.8 KB
master	`StartStopWithChild`	net472	16μs	43ns	161ns	1.05	0.322	0.0967	6.22 KB
#6381	`StartStopWithChild`	net6.0	7.95μs	45.4ns	343ns	0.0155	0.00775	0	5.61 KB
#6381	`StartStopWithChild`	netcoreapp3.1	9.88μs	54.1ns	333ns	0.0142	0.00474	0	5.8 KB
#6381	`StartStopWithChild`	net472	16μs	43.8ns	164ns	1.03	0.296	0.0881	6.21 KB

Benchmarks.Trace.AgentWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`WriteAndFlushEnrichedTraces`	net6.0	526μs	1.46μs	5.65μs	0	2.7 KB
master	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	652μs	189ns	653ns	0	2.7 KB
master	`WriteAndFlushEnrichedTraces`	net472	837μs	406ns	1.57μs	0.417	3.3 KB
#6381	`WriteAndFlushEnrichedTraces`	net6.0	485μs	205ns	769ns	0	2.7 KB
#6381	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	664μs	537ns	2.08μs	0	2.7 KB
#6381	`WriteAndFlushEnrichedTraces`	net472	858μs	454ns	1.7μs	0.428	3.3 KB

Benchmarks.Trace.AspNetCoreBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`SendRequest`	net6.0	133μs	506ns	1.96μs	0.186	14.47 KB
master	`SendRequest`	netcoreapp3.1	147μs	333ns	1.29μs	0.218	17.27 KB
master	`SendRequest`	net472	0.00622ns	0.00194ns	0.00751ns	0	0 b
#6381	`SendRequest`	net6.0	126μs	362ns	1.4μs	0.195	14.47 KB
#6381	`SendRequest`	netcoreapp3.1	143μs	518ns	2μs	0.218	17.27 KB
#6381	`SendRequest`	net472	0.0124ns	0.00341ns	0.0132ns	0	0 b

Benchmarks.Trace.CIVisibilityProtocolWriterBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Gen 2	Allocated
master	`WriteAndFlushEnrichedTraces`	net6.0	550μs	2.07μs	7.76μs	0.534	0	0	41.66 KB
master	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	683μs	3.81μs	29.7μs	0.322	0	0	41.73 KB
master	`WriteAndFlushEnrichedTraces`	net472	839μs	3.82μs	14.8μs	8.45	2.53	0.422	53.29 KB
#6381	`WriteAndFlushEnrichedTraces`	net6.0	541μs	1.19μs	4.45μs	0.553	0	0	41.66 KB
#6381	`WriteAndFlushEnrichedTraces`	netcoreapp3.1	668μs	3.06μs	12.2μs	0.321	0	0	41.71 KB
#6381	`WriteAndFlushEnrichedTraces`	net472	829μs	4.06μs	16.8μs	8.22	2.47	0.411	53.26 KB

Benchmarks.Trace.DbCommandBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Allocated
master	`ExecuteNonQuery`	net6.0	1.32μs	0.563ns	2.03ns	0.014	0	1.02 KB
master	`ExecuteNonQuery`	netcoreapp3.1	1.77μs	1.87ns	7.23ns	0.0133	0	1.02 KB
master	`ExecuteNonQuery`	net472	2.07μs	2.57ns	9.95ns	0.157	0.00104	987 B
#6381	`ExecuteNonQuery`	net6.0	1.27μs	0.898ns	3.36ns	0.0146	0	1.02 KB
#6381	`ExecuteNonQuery`	netcoreapp3.1	1.73μs	1.39ns	5.37ns	0.0138	0	1.02 KB
#6381	`ExecuteNonQuery`	net472	2.11μs	2.32ns	8.99ns	0.156	0.00105	987 B

Benchmarks.Trace.ElasticsearchBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`CallElasticsearch`	net6.0	1.18μs	0.872ns	3.38ns	0.0135	976 B
master	`CallElasticsearch`	netcoreapp3.1	1.5μs	0.619ns	2.32ns	0.0135	976 B
master	`CallElasticsearch`	net472	2.55μs	1.56ns	5.82ns	0.158	995 B
master	`CallElasticsearchAsync`	net6.0	1.27μs	0.712ns	2.66ns	0.0133	952 B
master	`CallElasticsearchAsync`	netcoreapp3.1	1.59μs	1.04ns	4.03ns	0.0144	1.02 KB
master	`CallElasticsearchAsync`	net472	2.59μs	1.31ns	4.9ns	0.167	1.05 KB
#6381	`CallElasticsearch`	net6.0	1.17μs	1.02ns	3.67ns	0.0136	976 B
#6381	`CallElasticsearch`	netcoreapp3.1	1.53μs	1.32ns	4.94ns	0.013	976 B
#6381	`CallElasticsearch`	net472	2.55μs	1.91ns	7.4ns	0.157	995 B
#6381	`CallElasticsearchAsync`	net6.0	1.36μs	0.805ns	3.01ns	0.013	952 B
#6381	`CallElasticsearchAsync`	netcoreapp3.1	1.7μs	1.54ns	5.98ns	0.014	1.02 KB
#6381	`CallElasticsearchAsync`	net472	2.56μs	1.11ns	4.14ns	0.166	1.05 KB

Benchmarks.Trace.GraphQLBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`ExecuteAsync`	net6.0	1.21μs	0.593ns	2.22ns	0.013	952 B
master	`ExecuteAsync`	netcoreapp3.1	1.77μs	1.17ns	4.4ns	0.0131	952 B
master	`ExecuteAsync`	net472	1.89μs	0.508ns	1.97ns	0.145	915 B
#6381	`ExecuteAsync`	net6.0	1.24μs	0.497ns	1.79ns	0.013	952 B
#6381	`ExecuteAsync`	netcoreapp3.1	1.6μs	1.26ns	4.73ns	0.0129	952 B
#6381	`ExecuteAsync`	net472	1.81μs	0.633ns	2.28ns	0.145	915 B

Benchmarks.Trace.HttpClientBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`SendAsync`	net6.0	4.52μs	1.39ns	5.21ns	0.0315	2.31 KB
master	`SendAsync`	netcoreapp3.1	5.3μs	4.79ns	18.5ns	0.0368	2.85 KB
master	`SendAsync`	net472	7.4μs	1.43ns	5.35ns	0.495	3.12 KB
#6381	`SendAsync`	net6.0	4.57μs	1.09ns	3.95ns	0.032	2.31 KB
#6381	`SendAsync`	netcoreapp3.1	5.21μs	2.41ns	9.32ns	0.0366	2.85 KB
#6381	`SendAsync`	net472	7.33μs	1.53ns	5.74ns	0.495	3.12 KB

Benchmarks.Trace.ILoggerBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	1.45μs	0.98ns	3.67ns	0.0231	1.64 KB
master	`EnrichedLog`	netcoreapp3.1	2.12μs	1.62ns	6.07ns	0.0222	1.64 KB
master	`EnrichedLog`	net472	2.74μs	1.5ns	5.8ns	0.25	1.57 KB
#6381	`EnrichedLog`	net6.0	1.45μs	1.03ns	3.99ns	0.0231	1.64 KB
#6381	`EnrichedLog`	netcoreapp3.1	2.27μs	0.891ns	3.33ns	0.0217	1.64 KB
#6381	`EnrichedLog`	net472	2.6μs	3.94ns	14.8ns	0.249	1.57 KB

Benchmarks.Trace.Log4netBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Gen 1	Allocated
master	`EnrichedLog`	net6.0	114μs	189ns	732ns	0.0569	0	4.28 KB
master	`EnrichedLog`	netcoreapp3.1	119μs	206ns	798ns	0	0	4.28 KB
master	`EnrichedLog`	net472	149μs	72.5ns	271ns	0.668	0.223	4.46 KB
#6381	`EnrichedLog`	net6.0	116μs	129ns	499ns	0.0581	0	4.28 KB
#6381	`EnrichedLog`	netcoreapp3.1	120μs	203ns	788ns	0.06	0	4.28 KB
#6381	`EnrichedLog`	net472	151μs	214ns	827ns	0.683	0.228	4.46 KB

Benchmarks.Trace.NLogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	3.01μs	1.07ns	4.13ns	0.0305	2.2 KB
master	`EnrichedLog`	netcoreapp3.1	4.23μs	1.89ns	7.32ns	0.0296	2.2 KB
master	`EnrichedLog`	net472	4.84μs	0.408ns	1.47ns	0.32	2.02 KB
#6381	`EnrichedLog`	net6.0	2.95μs	0.817ns	3.17ns	0.031	2.2 KB
#6381	`EnrichedLog`	netcoreapp3.1	4.13μs	1.82ns	7.06ns	0.0288	2.2 KB
#6381	`EnrichedLog`	net472	4.88μs	1.08ns	4.05ns	0.319	2.02 KB

Benchmarks.Trace.RedisBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`SendReceive`	net6.0	1.33μs	1.18ns	4.41ns	0.0159	1.14 KB
master	`SendReceive`	netcoreapp3.1	1.75μs	5.92ns	22.9ns	0.0158	1.14 KB
master	`SendReceive`	net472	2.12μs	0.722ns	2.8ns	0.183	1.16 KB
#6381	`SendReceive`	net6.0	1.36μs	1.27ns	4.92ns	0.0162	1.14 KB
#6381	`SendReceive`	netcoreapp3.1	1.75μs	5.58ns	21.6ns	0.0155	1.14 KB
#6381	`SendReceive`	net472	2.02μs	3.03ns	11.3ns	0.183	1.16 KB

Benchmarks.Trace.SerilogBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`EnrichedLog`	net6.0	2.71μs	0.648ns	2.34ns	0.0216	1.6 KB
master	`EnrichedLog`	netcoreapp3.1	3.85μs	1.65ns	5.94ns	0.0214	1.65 KB
master	`EnrichedLog`	net472	4.43μs	3.58ns	13.8ns	0.324	2.04 KB
#6381	`EnrichedLog`	net6.0	2.73μs	1.03ns	3.98ns	0.0216	1.6 KB
#6381	`EnrichedLog`	netcoreapp3.1	3.76μs	2.16ns	8.38ns	0.0226	1.65 KB
#6381	`EnrichedLog`	net472	4.52μs	3.16ns	12.2ns	0.322	2.04 KB

Benchmarks.Trace.SpanBenchmark - Slower ⚠️ Same allocations ✔️

Slower ⚠️ in #6381

Benchmark	diff/base	Base Median (ns)	Diff Median (ns)	Modality
Benchmarks.Trace.SpanBenchmark.StartFinishSpan‑net6.0	1.156	400.90	463.55

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`StartFinishSpan`	net6.0	401ns	0.496ns	1.92ns	0.00815	576 B
master	`StartFinishSpan`	netcoreapp3.1	568ns	1.8ns	6.98ns	0.00775	576 B
master	`StartFinishSpan`	net472	604ns	0.974ns	3.77ns	0.0918	578 B
master	`StartFinishScope`	net6.0	477ns	0.702ns	2.72ns	0.00967	696 B
master	`StartFinishScope`	netcoreapp3.1	665ns	1.03ns	3.97ns	0.00935	696 B
master	`StartFinishScope`	net472	874ns	1.41ns	5.47ns	0.104	658 B
#6381	`StartFinishSpan`	net6.0	463ns	0.701ns	2.71ns	0.00804	576 B
#6381	`StartFinishSpan`	netcoreapp3.1	623ns	1.28ns	4.95ns	0.00766	576 B
#6381	`StartFinishSpan`	net472	622ns	0.997ns	3.86ns	0.0916	578 B
#6381	`StartFinishScope`	net6.0	484ns	0.846ns	3.28ns	0.00973	696 B
#6381	`StartFinishScope`	netcoreapp3.1	717ns	1.64ns	6.14ns	0.00932	696 B
#6381	`StartFinishScope`	net472	853ns	1.46ns	5.67ns	0.104	658 B

Benchmarks.Trace.TraceAnnotationsBenchmark - Same speed ✔️ Same allocations ✔️

Raw results

Branch	Method	Toolchain	Mean	StdError	StdDev	Gen 0	Allocated
master	`RunOnMethodBegin`	net6.0	646ns	0.639ns	2.48ns	0.00962	696 B
master	`RunOnMethodBegin`	netcoreapp3.1	905ns	1.14ns	4.43ns	0.00912	696 B
master	`RunOnMethodBegin`	net472	1.11μs	2.23ns	8.64ns	0.104	658 B
#6381	`RunOnMethodBegin`	net6.0	623ns	0.734ns	2.84ns	0.0096	696 B
#6381	`RunOnMethodBegin`	netcoreapp3.1	950ns	1.18ns	4.58ns	0.00907	696 B
#6381	`RunOnMethodBegin`	net472	1.05μs	1.87ns	7.25ns	0.104	658 B

fix typo

andrewlock

LGTM, just some nits

andrewlock · 2025-01-10T10:35:25Z

.azure-pipelines/ultimate-pipeline.yml

@@ -254,6 +254,39 @@ stages:
      displayName: Generate Matrices
      name: generate_variables_step

+- stage: generate_LLM_Report


nit: could we please put this at the bottom of the file, as it's entirely optional? 🥺

.azure-pipelines/ultimate-pipeline.yml

andrewlock · 2025-01-10T10:36:44Z

.azure-pipelines/ultimate-pipeline.yml

+  - template: steps/update-github-status-jobs.yml
+    parameters:
+      jobs: [generate_LLM_job]
+
+  - job: generate_LLM_job
+    timeoutInMinutes: 3
+    dependsOn: []
+    pool:
+      name: azure-windows-scale-set-3
+
+    steps:
+    - template: steps/clone-repo.yml
+      parameters:
+        targetShaId: $(targetShaId)
+        targetBranch: $(targetBranch)
+    - template: steps/install-latest-dotnet-sdk.yml
+
+    - powershell: |
+        tracer/build.ps1 LLMReport
+      displayName: Generate LLM report
+      name: generate_llm_step
+      env:
+        PullRequestNumber: $(System.PullRequest.PullRequestNumber)
+        GITHUB_TOKEN: $(GITHUB_TOKEN)
+        OpenAIKey: $(OPEN_AI_KEY)


This is the wrong value I think, should be OPEN_AI_KEY based on the Nuke code?

It's defined as
[Parameter("An OpenAI key", Name = "OPEN_AI_KEY")]
readonly string OpenAIKey;

It seems to be working...

andrewlock · 2025-01-10T10:38:09Z

tracer/build/_build/Build.GitHub.cs

+        else if (string.IsNullOrEmpty(OpenAIKey))
+        {
+            result = "Null or empty OpenAI key.";
+        }


This won't happen because you marked it Required()

Thanks! Done!

andrewlock · 2025-01-10T10:40:29Z

tracer/build/_build/Build.GitHub.cs

+
+            if (executeLocal)
+            {
+                File.WriteAllText("changes.txt", fullPrompt);


I think you should make this absolute paths, so the location of changes.txt is fixed and in a sensible (i.e. temporary) place. That way it will be excluded by the gitignore (same goes for the LLMResult.txt).

Personally, I'd rather we didn't generate these files, and instead just print them to the console. You can always pipe them to a file if you want to anyway.

Agree. Thanks! Done!

tracer/build/_build/Build.GitHub.cs

andrewlock · 2025-01-10T10:43:10Z

tracer/build/_build/Build.GitHub.cs

@@ -94,6 +99,92 @@ await client.Issue.Update(
            Console.WriteLine($"PR assigned");
        });

+    Target LLMReport => _ => _


Maybe we should rename it so that it's clear it's only a review of a PR? 🤔

I have renamed the target to LLMPRReview. Thanks!

tracer/build/_build/OpenAI/OpenAIAPICall.cs

Co-authored-by: Andrew Lock <[email protected]>

e-n-0

Nice! I'll try it in some of my PRs 😄

e-n-0 · 2025-01-20T12:29:22Z

tracer/build/_build/OpenAI/OpenAIAPICall.cs

+
+        var requestContent = new
+        {
+            model = "gpt-4o",


Is the result really tied to the model used? (like does it do any difference with o1 and the thinking pattern?)
We can maybe in the future specify the openAI model as an arg if other new models get released

Yes, I chose this model as a compromise between results and price, but adding the model as an optional argument could be a nice feature. Thanks!

NachoEchevarria · 2025-01-22T17:21:05Z

Thanks for your feedback and reviews!

NachoEchevarria added 4 commits November 5, 2024 17:40

nacho/OpenAIComments

a5d69d0

Update

08f07c0

Improve calls

b3fb9d2

allow local llm check

fbf1acd

github-actions bot added the area:builds project files, build scripts, pipelines, versioning, releases, packages label Dec 2, 2024

NachoEchevarria and others added 3 commits December 2, 2024 12:29

Merge branch 'master' into nacho/experimentalLlm

6a1980c

test

ec431a9

Update llm workflow

e1ba671

NachoEchevarria and others added 15 commits December 2, 2024 16:15

try pipeline

6c16dbb

fix

894a03e

Ad condition

ee67e90

fix

4957101

fix condition

233f2f4

try again

beabda5

remnove not needed

c0dd2e1

Merge branch 'master' into nacho/experimentalLlm

82262c7

Merge branch 'master' into nacho/experimentalLlm

8da8089

Count tokens

7b101d7

remove not needed parameter

9c20fea

Exclude description

8022940

Add exclusions

46a9ff4

Update Build.GitHub.cs

f8b5741

fix typo

remove not needed

baba996

DataDog deleted a comment from github-actions bot Dec 12, 2024

NachoEchevarria marked this pull request as ready for review December 12, 2024 16:47

NachoEchevarria requested a review from a team as a code owner December 12, 2024 16:47

NachoEchevarria changed the title ~~Nacho/experimental llm~~ LLM PR reviewer Dec 12, 2024

Merge branch 'master' into nacho/experimentalLlm

c6d3b1b

andrewlock approved these changes Jan 10, 2025

View reviewed changes

NachoEchevarria and others added 7 commits January 14, 2025 12:27

Update .azure-pipelines/ultimate-pipeline.yml

fe99e70

Co-authored-by: Andrew Lock <[email protected]>

Move generate_LLM_Report to the bottom of the file

0e658b5

Update tracer/build/_build/OpenAI/OpenAIAPICall.cs

a578c22

Co-authored-by: Andrew Lock <[email protected]>

Update tracer/build/_build/Build.GitHub.cs

709ebdb

Co-authored-by: Andrew Lock <[email protected]>

Write output in console instead of file. Fix indents.

31fa7b9

Rename stage to LLMPRReview

dc8a352

Merge branch 'master' into nacho/experimentalLlm

da1a53d

DataDog deleted a comment from andrewlock Jan 20, 2025

e-n-0 approved these changes Jan 20, 2025

View reviewed changes

e-n-0 reviewed Jan 20, 2025

View reviewed changes

Merge branch 'master' into nacho/experimentalLlm

820ed6d

NachoEchevarria merged commit fe5fbb5 into master Jan 22, 2025
125 of 127 checks passed

NachoEchevarria deleted the nacho/experimentalLlm branch January 22, 2025 17:21

github-actions bot added this to the vNext-v3 milestone Jan 22, 2025

LLM PR reviewer #6381

LLM PR reviewer #6381

Conversation

NachoEchevarria commented Dec 2, 2024 • edited Loading

Summary of changes

Reason for change

Implementation details

Test coverage

Other details

datadog-ddstaging bot commented Dec 2, 2024 • edited Loading

Datadog Report

New Flaky Tests (1)

andrewlock commented Dec 2, 2024 • edited Loading

Execution-Time Benchmarks Report ⏱️

andrewlock commented Dec 2, 2024 • edited Loading

Throughput/Crank Report ⚡

andrewlock commented Dec 2, 2024 • edited Loading

Benchmarks Report for tracer 🐌

Benchmark details

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Raw results

Slower ⚠️ in #6381

Raw results

Raw results

andrewlock left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

e-n-0 left a comment

Choose a reason for hiding this comment

e-n-0 Jan 20, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NachoEchevarria commented Jan 22, 2025

NachoEchevarria commented Dec 2, 2024 •

edited

Loading

datadog-ddstaging bot commented Dec 2, 2024 •

edited

Loading

andrewlock commented Dec 2, 2024 •

edited

Loading

andrewlock commented Dec 2, 2024 •

edited

Loading

andrewlock commented Dec 2, 2024 •

edited

Loading

e-n-0 Jan 20, 2025 •

edited

Loading