Working on Var #1

icemelon · 2016-10-13T03:29:29Z

No description provided.

* updates (#1) * add scalars * change format * change inferattr interface * remove scalar * remove warning

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Import & Cache Mechanism (apache#26) [BugFix] Fix Winograd Test Script (apache#25) Add task extraction & caching (apache#27) A few fixes for task extraction (apache#28) Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Bohan Hou <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Wuwei Lin <[email protected]> Co-authored-by: Sunghyun Park <[email protected]> Co-authored-by: Xiyou Zhou <[email protected]>

[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485) [Meta Schedule][M3c] PostOrderApply (apache#486) Fix Post Order Apply (apache#490) [MetaSchedule] Relay Integration (apache#489) [M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492) Fix replay trace. (apache#493) [M3c][Meta Schedule] Implement the Replay Func class. (apache#495) [PR] Test script for meta-schedule task extraction. Interface to load… (apache#494) [Meta Schedule Refactor] Get child blocks (apache#500) Read-at && Write-at (apache#497) [M3c][Meta Schedule] Measure Callbacks (apache#498) [Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496) [MetaSchedule] Sample-Perfect-Tile (apache#501) [MetaSchedule] TE Workloads (apache#502) [TensorIR] GetProducer, GetConsumer (apache#506) [MetaScheduleRefactor] Annotate&Unannotate (apache#505) [MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503) [Tests] Add unittests for auto-inline and multi-level-tiling (apache#508) [Meta Schedule] Minor Fixes (apache#507) [MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509) [MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499) [Meta Schedule] Add Helper Function & Minor Modification (apache#512) [MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll (apache#513) [Meta Schedule] Feature Extractor & Cost Model (apache#510) Blockize & Tensorize (apache#514) Layout Rewriting: Suggest-Index-Map (apache#520) [MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516) [Meta Schedule] Per-Store-Feature (apache#521) Add traced schedule for blockize & tensorize (apache#526) [Meta Schedule] Add XGBoost Model & Random Model (apache#519) User-Interface: Tune-TIR (apache#525) User-Interface: Tune-TE (apache#527) [Minor] More logging on python (apache#528) Get CUDA tuning working (apache#529) [MetaSchedule] TensorRT BYOC (apache#518) [BugFix] 
LocalBuilder API (apache#531) [Meta Schedule] Add Cost Model Update Measure Callback (apache#530) [Bugfix] BuilderInput with default params (apache#532) [MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534) [Meta Schedule] Evolutionary Search (apache#522) [BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535) [Meta Schedule] Fix some bugs (apache#537) Initiate Experiments for CPU Performance Alignment with Ansor (apache#538) [Meta Schedule] Tweak experiment scripts (apache#539) [Meta Schedule] Initiate experiments on CUDA (apache#540) [TIR][Schedule] Buffer transform (apache#523) Auto Tensor Core (apache#524) Working on Evo Search (apache#542) [Meta Schedule] Add Replay Tuning Interface (apache#543) Evolutionary Search on CPU (apache#544) Misc improvement over the error message (apache#545) [TIR][Schedule] Software pipelining (apache#533) [Meta Schedule Refactor] fixing unit tests (apache#547) [MetaSchedule] Mutator-Compute-Location (apache#548) Misc Improvement of Evolutionary Search (apache#549) Hotfix for software pipeline (apache#552) Misc Improvement (apache#550) [Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555) Rule RFactor (apache#551) [MemHammer] Rewrite Rules (apache#554) [MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556) [MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559) [MetaSchedule] Perf Alignment - NRM on CUDA (apache#560) [TIR] Reorder the block iters of the blocks generated by RFactor (apache#561) Removing 2 unit tests for software pipelining (apache#562) [MemHammer] Lower Pass + Unittests (apache#557) Perf Align: Remove Auto-inline before Multi-level-tiling (apache#564) Fix Sketch Generation Unittests (apache#565) speed up VerifyGpuCode (apache#568) [Performance Align] fixing codegen problems (apache#569) [Meta schedule] improve search space (apache#1) Hot fix for bound predicate (apache#3) [Meta Schedule] Update Tune Relay (apache#4) [Performance 
Align] fixing codegen problems (apache#5) [PerfAlign] NRM & SFM on Raspi Aligned (apache#6) [BugFix] Apply bound predicate directly to loops when possible (apache#12) [BugFix] Fix CrossThreadReduction on CUDA (apache#13) [MetaSchedule] Enable BertTuning with MetaScheduler (apache#11) [Minor][MemHammer] Minor tweaks in code review (apache#14) [Meta Schedule] Add customizable search space to PostOrderApply. (apache#16) Fix cooperative fetching (apache#17) Fixes for codegen (apache#18) [Hotfix] A unittest (apache#19) Fix for GRP sketch gen (apache#21) Add threadIdx filtering in Multi-Level-Tiling and Verify-GPU-Code (apache#20) [BugFix][TIR] Fix cross-thread reduction when single reduction loop with predicate (apache#10016) (apache#22) [MemHammer][Refactor] Code Review (apache#15) [Meta Schedule] Add Winograd Test for Customizable Search Space (apache#24) Co-authored-by: Siyuan Feng <[email protected]> Co-authored-by: Bohan Hou <[email protected]> Co-authored-by: Hongyi Jin <[email protected]> Co-authored-by: Ruihang Lai <[email protected]> Co-authored-by: Junru Shao <[email protected]> Co-authored-by: Wuwei Lin <[email protected]> Co-authored-by: Sunghyun Park <[email protected]> Co-authored-by: Xiyou Zhou <[email protected]>

[SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <[email protected]> Fix AxisTree (apache#3) * fix axis tree * upd [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr [BugFix] Fix binary search & SpIterVar (apache#7) [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting [SparseTIR] Index Lowering (apache#8) * Add StmtFunctor/ExprFunctor for SparseBufferStore/Load * Add basic index lowering * Finish index lowering (maybe) * Address comments * Convert CRLF to LF Frontend update, demo scripts. (apache#10) * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <[email protected]> * Fix AxisTree (apache#3) * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * fix axis tree * upd * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. (apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <[email protected]> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * Format and Buffer data structure (apache#1) * [SparseTIR] Constructors and Python Interface for `Axis` and `SparseBuffer` (apache#2) * add methods for Object * axis constructors * methods for SparseBuffer * put into registry * python interface * [CherryPick][Intrinsic] lower_bound and upper_bound for binary search in Sparse TIR. 
(apache#483) (apache#4) * upd * upd * fix * upd * upd * upd * upd * upd * fix * upd * upd * upd * upd * upd * upd * upd * codegen-rule * upd * upd * test * upd * fix * two arguments Co-authored-by: Zihao Ye <[email protected]> * Fix AxisTree (apache#3) * fix axis tree * upd * [SparseTIR] Add SparseBufferLoad/SparseBufferStore (apache#5) * Add dtype for SparseBuffer * Add name for SparseBuffer. Remove `ndim` * Remove namespace sparse * Add SparseBufferLoad/Store * Add method `ndim()` * [SparseTIR] Introduce SpIterVar (apache#6) * [SparseTIR] Introduce SpIterVar * Add conversion to PrimExpr * [BugFix] Fix binary search & SpIterVar (apache#7) * [BugFix] Add field `is_reduction` for SpIterVar (apache#9) * [BugFix] Add field `is_reduction` for SpIterVar * Formatting * upd * upd Co-authored-by: Ruihang Lai <[email protected]> [SparseTIR] SparseBlock on C++/Python side (apache#11) * Fix a bug in the last commit * SparseBlock on C++ & Python side [BugFix][SparseTIR] TVMScript Parser for Axis & SpIterVar (apache#12) * Update `cord` and `pos` * Fix `idtype` * Formatting.. 
* Bug fix 1 * Move new special stmts * Parser for Axis and SpIterVar * Fix context_maintainer.py [SparseTIR] Enhance SparseBlock to contain enough PrimFunc information (apache#13) * Enhance SparseBlock to have enough PrimFunc info * Remove `func_sparse_buffer_map_` * Don't print the map uh-huh [SparseTIR] Parser, Printer, Roundtrip (apache#14) * SparseBlock scope handler (part 1) * SparseBlock scope handler (part 2) * SparseBlock scope handler (part 3) * SparseBlock scope handler (fix 1) * Add SparseBufferLoad/Store on Python side * Parser for SparseBufferLoad/Store * Add SparseBlock to Python __init__ * StmtFunctor for SparseBlock * Ensure at least one dimension for SparseBuffer * Make `axis` field of SpIterVar mandatory * SparseBlock scope handler (fix 2) * Update Axis syntax by removing `name` parameter * Move to intrin.py * Add filed `from_sparse` to DenseFixedAxis * SparseTIR script printer * Roundtrip test * `update_symbol` bug fix * Fix attr visit in SparseBuffer * Define then compare in SparseBlock * Fix printer bug for SparseBuffer * Enable graph match for Axis and SparseBuffer * Complete HashReduce and EqualReduce for AxisTree and SparseBuffer * Fix typo * Rename test * Bug fix 1 * Bug fix 2 * Add more tests Move tests (apache#15) [SparseTIR] ReprPrinter for Axis and SpIterVar (apache#16) upd (apache#17) flatten (apache#18) ELL and BSR correctness test scripts (apache#19) [SparseTIR] SparseTIR Lowering (apache#20) * Fix a previous bug of sparse-fixed SpIterVar creation * Fix a previous bug in `GetDenseValue` * Refactor Collector and IndexTransformer * Construct block and loops * Fix a previous bug which rejects DV iters in collector * Update buffer map * Create root block * Fix bug of sparse-fixed SpIterVar creation * Fix bug on SpIterVar conversion (with refactor) * Fix bug when getting dependent SpIterVars * Fix bug on dependency map and index lowering * Full block read/write region * Test version 1 * Fix bug of loop order * Fix bug of batch-mm iterator 
ordering * Update PrimFunc args to use symbolic params * Fix bug of test "csr_element_wise" * Fix bug of index accumulation for sparse-fixed axis * Update correctness test * Test structural equality * Refactor and use Array fix nnz cols Add docstring for sparse tir lowering (apache#21) * add docstring * upd Add more examples part 1 (sddmm) (apache#22) * upd * upd * upd [SparseTIR][Schedule] SparseBlockRV, GetSparseBlock, SparseReorder (apache#23) * Test initialization * Fix a stupid bug of ReprPrinter * Add SparseBlockRV * Schedule: GetSparseBlock * Schedule: Reorder [SparseTIR][Schedule] GetSpIters (apache#24) remove hybrid script for successful compilation Add atomic intrinsic for output nonzero inference. (apache#25) * upd * upd Add "sparse" block attribute. (apache#26) Revert "remove hybrid script for successful compilation" This reverts commit eebd7c1. [SparseTIR] Hack `IsAffineBinding` check (apache#27) * [TensorIR][Schedule] Inherit block anotation upon creating new blocks * Fix SDDMM test * Hack IsAffineBinding for sparse blocks Axis Dependency Tree aware code-gen and bmm example (apache#28) * upd * upd * upd * upd * upd * upd * upd * upd * remove redundancy * fix * upd * upd Re-design Indices lowering (apache#29) * upd * upd * upd * upd * upd * init * format * fix * revise coding-style * format Complete indices lowering (apache#30) * upd * upd * upd * done * upd * passed test * upd Add more docstrings and depress warnings for new lowering algorithm. (apache#31) Refactor derived axis, frontend support of fusion. (apache#32) * upd * upd * fix Fatal bugfix and change the signature of DenseVariableAxis. (apache#33) Syntax simplification (apache#34) Change the order of generated blocks for block isolation. 
(apache#35) * upd * upd * upd Syntax of AttachAxis for BMM (apache#36) * upd * upd * upd [SparseTIR] Add "square sum" lowering test (apache#37) * Add square sum test * Remove pylint comment [BugFix] Fix offset caching in lowering (apache#38) * Hack compact dataflow check in a dirty way * Add two-K square sum test * Mark skipped tests * Fix offset saving in lowering Fusion syntax fix + SDDMM example. (apache#39) Some structure change on update offsets. (apache#40) [Refactor] SparseTIR Lowering (apache#41) * Take out methods in Scope * Refactor * Refactor "match" * Tweak scope contents * Refactor ViewIndexInAxis * Refactor Scope * SDDMM tests under implementation * Refactor block stack * Use Map for var_map * Extract NeedCreateNewBlock * Simplify SpIterVarToIterVar via GetIterExtent * Refactor NeedCreateNewBlock * Add docstring * Use "auto" correctly * Minor refactor and use some move Remove redundant analyzers (apache#42) Support indices lowering for attach and fuse. (apache#43) * upd * upd * upd Fix irregular BMM example. (apache#44) * upd * upd * upd * upd RGCN forward and butterfly pattern example. (apache#45) Fused SDDMM example. (apache#46) * upd * wip * fix Fix sparse reorder after refactor (apache#47) [Refactor] Refactor Unittest (apache#48) * upd * remove redundancy [Unittest] Correctness test for benchmarking scripts (apache#49) Bugfix and more test for axis fusion, new workload (apache#50) * upd * upd upd

Add debug print for Visual Studio immediate window.

[Relax][AOT] Add AOTMemoryLower pass when USMP is disabled

…15483) * [Script] Be more careful when generating ast.ExtSlice for Subscript The ast.ExtSlice expects a non-empty list, otherwise evaluation fails with "error: empty dims on ExtSlice". Also, each element in "dims" list of ExtSlice must be either Slice or Index. In python3.8 an expression A[()] is parsed (by ast) as Subscript with slice being Index(value=Tuple(elts=[])). When we translate a subscript from doc.AST to ast, we unconditionally convert every tuple to ast.ExtSlice, which in this case is incorrect. The fix is to map empty tuple back to the Index(Tuple[])) instead of ExtSlice. In other cases, ensure that members of ExtSlice are of correct types. * Fix lint #1
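The empty-tuple case described in the commit message above can be inspected with the standard `ast` module. The following is a minimal sketch, not the code from that commit: the helper name `classify_subscript` and its labels are hypothetical, chosen only for illustration. It assumes the behaviour named in the message for Python 3.8 (`Index(value=Tuple(elts=[]))`) and, as a further assumption from the editor, that Python 3.9+ stores the tuple directly in `Subscript.slice`. Node types are compared by class name so the sketch does not touch the deprecated `ast.Index` and `ast.ExtSlice` classes.

```python
import ast


def classify_subscript(source: str) -> str:
    """Label the slice of a subscript expression such as "A[()]".

    Hypothetical helper for illustration; not part of TVM.
    """
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Subscript):
        raise ValueError(f"not a subscript expression: {source!r}")

    sl = node.slice
    kind = type(sl).__name__

    # Python 3.8 only: mixed slices such as A[1:2, 3] parse to ExtSlice,
    # whose "dims" list must be non-empty.
    if kind == "ExtSlice":
        return "multi-dim"

    # Python 3.8 only: plain indices are wrapped in Index(value=...).
    if kind == "Index":
        sl = sl.value

    # A[()] ends up here with an empty Tuple; it must not become an ExtSlice.
    if isinstance(sl, ast.Tuple):
        return "empty-tuple" if not sl.elts else "multi-dim"

    return "single"


if __name__ == "__main__":
    for expr in ("A[()]", "A[0]", "A[1:2]", "A[1:2, 3]", "A[1, 2]"):
        print(expr, "->", classify_subscript(expr))
```

The point of the sketch is the ordering of the checks: an empty tuple is recognised before any conversion to a multi-dimensional slice would be attempted, which is the case the commit message says was previously mishandled.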

Updated SIMA.md to make sure we could build and link TVM correctly on

Merge code into Tilelang

Add kv transfer kernel

* base tuner * gpu schedule * matmul ops * initial commit * refactor fast dlight to bit blas * support i8 swizzle * int8xint2 gemm * update keep * update lop3 cpp test * all low int to float16 convert * int8 fast decoding * float16with scale * annotate tc layout propa * impl tir interleve test * impl interleave weight. * weight only propagation * support layout propagate recover schedule of dequantize. * refactor testing * enhance gemv schedule for dynamic * dequantize matmul initilization * [refactor] move comments to BitBLAS * evaluate pytorch integeration * evaluate correctness of weight only decode * annotate mit license * annotate apache/mit lisence * init logger * refactor ops test with pytest * ladder_permutate implementation * append tvm third party lisence * scaling ladder permutate impl * add storage dtype test * implement lop3 permutation ops and related test * support with propagate layout. * update tvm lisence * disable fmt in pytest * implement cpu arch for consistency * seperate gemv schedule and gemv_dequantize schedule. * fix typo * refactor quantization * init testing. * refactor matmul and operators * append dequantize and test items * reslove lisence related items * refactor implementation * init read me. * integration with faster transform imp * integerate bug fix. * update ignore * improve code structure. * update mit lisence * remove gitkeep file * provide simple tir benchmark result. * enhance build * auto layout deduce * fix default tensorize. * update ReadMe * update readme * update read me * update readme * simple fix * readme fix

* Checkpoint, nothing works * DNNL based codegen almost works * Work in dnnl style * Work in dnnl style * Arg passing works * Work in dnnl style * Codegen somewhat works * Requantization not working * Codegen works * Remove headsail_old

tqchen closed this as completed Oct 13, 2016

haolongzhangm mentioned this issue Nov 15, 2017

connect to this proxy server via TVM RPC application failed #638

Closed

tqchen referenced this issue in tqchen/tvm May 26, 2018

[PASS] Add save/load json (#1)

625ab2c

tqchen referenced this issue in tqchen/tvm May 26, 2018

update (apache#26)

034afc6

* updates (#1) * add scalars * change format * change inferattr interface * remove scalar * remove warning

tqchen added a commit that referenced this issue May 29, 2018

[PASS] Add save/load json (#1)

3c1ac2a

tqchen added a commit that referenced this issue May 29, 2018

update (#26)

c362a28

* updates (#1) * add scalars * change format * change inferattr interface * remove scalar * remove warning

tqchen referenced this issue in tqchen/tvm Jul 6, 2018

[PASS] Add save/load json (#1)

5d40732

tqchen referenced this issue in tqchen/tvm Jul 6, 2018

update (apache#26)

6ffeae9

* updates (#1) * add scalars * change format * change inferattr interface * remove scalar * remove warning

vinx13 referenced this issue in vinx13/tvm Jan 24, 2019

[TOPI][CUDA] Roi align v2 (#1)

2514df2

MarisaKirisame referenced this issue in MarisaKirisame/tvm Mar 29, 2019

Add memory manager (#1)

c1aee56

icemelon referenced this issue in icemelon/tvm Apr 4, 2019

Add memory manager (#1)

32c7bcd

icemelon referenced this issue in icemelon/tvm Apr 15, 2019

Add memory manager (#1)

becdbb1

icemelon referenced this issue in icemelon/tvm Apr 16, 2019

Add memory manager (#1)

e5a4715

kovasb mentioned this issue Apr 20, 2019

[RFC] Support Tensorflow Op Bridge #3059

Closed

u99127 mentioned this issue Jun 1, 2019

Implementation of uTVM #3227

Merged

weberlo added a commit to weberlo/tvm that referenced this issue Jun 13, 2019

Add MicroTVM tutorial patch (apache#1)

d9116dd

weberlo added a commit to weberlo/tvm that referenced this issue Jun 17, 2019

Add MicroTVM tutorial patch (apache#1)

5bf488f

weberlo added a commit to weberlo/tvm that referenced this issue Jun 18, 2019

Add MicroTVM tutorial patch (apache#1)

d16af94

weberlo added a commit to weberlo/tvm that referenced this issue Jun 19, 2019

Add MicroTVM tutorial patch (apache#1)

06658bb

huajsj mentioned this issue Jun 20, 2019

[VTA] Fix VTA function Vivado Compile Error. #3375

Merged

sherwinkh mentioned this issue Jun 25, 2019

Python3 import mxnet and nnvm.compiler caused core dumped #3431

Closed

hlq1025 mentioned this issue Aug 1, 2019

add new operator to onnx.py #3688

Closed

jdomke mentioned this issue Aug 23, 2019

[autotvm] runtime errors for simple matmul example with double matrix size #3823

Closed

Ruinhuang mentioned this issue Sep 13, 2019

[Relay] CUDA_ERROR_INVALID_VALUE occurs when testing conv2d_grad #3950

Closed

AnneChen mentioned this issue Oct 16, 2019

NNVM Error #4137

Closed

This was referenced Jan 8, 2020

convert onnx to tvm model error #4654

Closed

convert keras model to tvm error #4655

Closed

Airtnp mentioned this issue Jan 15, 2020

[AutoTVM] Failed to run autotvm example on GEMM #4717

Closed

kevinyuan mentioned this issue Jan 22, 2020

[VTA] Fix an issue in updating uop_idx in the TensorGemm module #4694

Merged

expectopatronm mentioned this issue Jan 30, 2020

TVMError: src/runtime/cuda/cuda_module.cc:93: CUDAError: cuModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: CUDA_ERROR_INVALID_PTX #1027

Closed

prateek9623 pushed a commit to prateek9623/tvm that referenced this issue May 1, 2022

Merge pull request apache#1 from microsoft/kedeng/dbg

b625640

Add debug print for Visual Studio immediate window.

zhiwei-dong mentioned this issue May 19, 2022

[Bug] Can't run prequantized tflite model through MicroTVM #11371

Open

cgerum mentioned this issue May 20, 2022

[Bug] INT8 cuda kernels fail for older GPU versions #11388

Closed

chayliu1991 mentioned this issue Jun 2, 2022

[Bug] can not runandroid_rpc_test.py #11536

Open

Icemist mentioned this issue Jun 2, 2022

Add cooldown interval logic for the profiling functional #11465

Merged

alexandrepires5 mentioned this issue Jun 19, 2022

[Bug] YOLOX Autotune error - RuntimeError: Invalid type of axis: <class 'tvm.tir.expr.SizeVar'> #11780

Closed

huajsj mentioned this issue Jun 24, 2022

[Bugfix][Runtime] Fix sched_setaffinity in Android #11599

Open

DikshanyaRam mentioned this issue Nov 2, 2022

[Bug] RPCTimeEvaluator signature error #12982

Closed

mehrdadh referenced this issue in mehrdadh/tvm Dec 7, 2022

Merge pull request #1 from gigiblender/aot-mem-lower

66eae17

[Relax][AOT] Add AOTMemoryLower pass when USMP is disabled

july8023 mentioned this issue Jan 11, 2023

[Bug] a bug about onnx aten::index_put. #13759

Open

elvin-n mentioned this issue Feb 1, 2023

[TOPHUB] use keys as a keyword for searching of existing statistics #13874

Merged

hcms1994 mentioned this issue Apr 25, 2023

[Bug] Check failed: type_code_ == kTVMObjectHandle (0 vs. 8) : expected Object but got int #14717

Open

jikechao mentioned this issue May 7, 2023

[Bug] [torch] Expected Array[PrimExpr], but got Array[index 1: relay.Constant] #14794

Closed

kparzysz-quic pushed a commit to kparzysz-quic/tvm that referenced this issue Aug 4, 2023

Fix lint apache#1

fd6b7ac

tingyuzaixiao mentioned this issue Aug 9, 2023

Check failed: (!axes.defined() || static_cast<int>(axes.size()) == ndim) is false: Dimensionmismatch: axes has 2 elements, but data.ndim = 4 #15514

Closed

mikeseven pushed a commit to mikeseven/tvm that referenced this issue Sep 27, 2023

Merged in documents/sima.md (pull request apache#1)

1ca5de0

Updated SIMA.md to make sure we could build and link TVM correctly on

bbbeomjin mentioned this issue Nov 2, 2023

[Bug] InternalError: Check failed: (target_has_feature_fn_ptr) is false: Function target.target_has_feature not found #16037

Open

cbalint13 mentioned this issue Mar 27, 2024

[Target] Use LLVM target parser for determining Arm(R) A-Profile Architecture features #16425

Merged

LeiWang1999 referenced this issue in TileLang/tvm Oct 5, 2024

Merge pull request #1 from TileLang/cy_merge

ef0837f

Merge code into Tilelang

MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this issue Oct 26, 2024

Merge pull request apache#1 from cmu-catalyst/disaggregation

6e3d8b9

Add kv transfer kernel

icemelon commented Oct 13, 2016