microsoft/Accera
Open source cross-platform compiler for compute-intensive loops used in AI algorithms, from Microsoft Research
29 Releases
Latest: 3y ago
v1.2.29 (Latest)
📋 What's Changed
- Merged PR 3211: Upgrade hatlib dependency to 0.0.39. [Captain Jack Sparrow]
- Merged PR 3209: Support AffineParallelOp and scf::ParallelOp in RangeValue utils. [Mason Remy]
- Merged PR 3207: Fix parallelization and enable file checker in tests.
- + 29 more
v1.2.28
📋 What's Changed
- Merged PR 3199: Rename _slice to slice and add docs. [Captain Jack Sparrow]
- Merged PR 3197: Preserve dest memref shape during SliceOp to SubViewOp lowering. [Captain Jack Sparrow]
- Preserve dest memref shape during SliceOp to SubViewOp lowering:
- Without this change, the subview op would discard the dest memref type required by the slice op.
- + 52 more
v1.2.27
📋 Changes
- Merged PR 3181: Fix bug with reinterpret_cast of partially-dynamic
- Merged PR 3180: Enable getting a memref shape from a memref_cast
- Merged PR 3179: Fix vulkan-specific smoke test break. [Lisa Ong]
v1.2.26
📋 Changes
- Merged PR 3176: [Accera] split_dim op supports dynamic dims with
- Merged PR 3174: Ensure any dynamic allocations are heap allocs that
- Merged PR 3171: [test] Add some tests for Dimensions. [Kern Handa]
- Merged PR 3175: Support reinterpret cast of same bitwidth without
- Merged PR 3167: Remove hack to treat INPUT_OUTPUT Arrays with shape
- Merged PR 3165: [build] Fix clang 14 release build warnings treated as
- Merged PR 3162: Bump vcpkg to latest release. [Lisa Ong]
- Merged PR 3161: Fix cache reduce scale constant hoisting. [Mason Remy]
- + 1 more
v1.2.25
📋 Changes
- Merged PR 3160: [security] bump onnx to 1.13.0. [Lisa Ong]
- Merged PR 3157: Dynamic split dim tests. [Mason Remy]
- Merged PR 3158: Do not unroll the profiling ops when vectorization
- Merged PR 3153: Fix the lowering issue of the profiling ops. [Denny
- Merged PR 3152: [nfc] [test] Skip fast_exp mlas tests on unsupported
v1.2.24
📋 Changes
- Merged PR 3150: Change high precision fp to not perform contraction.
- Merged PR 3147: Fix vector cast with same bitwidth. [Mason Remy]
- Merged PR 3149: Improve 1-D horizontal sum reductions for 8xf32 and
- Merged PR 3148: Adds Package level FP precision override. [Kern Handa]
- Merged PR 3144: Removes fp precision as an option for Package.build.
- Merged PR 3143: Add dsl test for profiling op. [Denny Sun]
- Merged PR 3022: Assert the arg order in debug mode. [Denny Sun]
- Merged PR 3137: expose profiling function to DSL. [Denny Sun]
- + 10 more
v1.2.23
📋 Changes
- Merged PR 3131: Set masked load/store inbounds flag to true. [Mason
- Merged PR 3130: Recognize and simplify always true EQ and NE CmpOps.
- Merged PR 3129: Optimize 1-row horizontal i16->i32 sum reduction.
- Merged PR 3118: vectorize accumulation of results of two masked load
- Merged PR 3126: [test] Adds more tests for vectorized transpose. [Kern
- Merged PR 3121: [nfc] Separate bounds checking into separate pass
- Merged PR 3122: Fix reinterpret_cast output memref shape. [Mason Remy]
- Merged PR 3115: Normalize AffineForOps to have unit stride and begin
- + 2 more
v1.2.22
📋 Changes
- Merged PR 3107: Make vectorization happen after inlining and
- Merged PR 3108: extend vectorization for masked store case. [JUBI
- Merged PR 3109: Set conan version < 2.0.0. [Mason Remy]
- Merged PR 3104: Position fusing dim after the fused dimensions.
- Merged PR 3096: Add "RelWithDebInfo"-like option to accc. [Chuck
v1.2.21
📋 Changes
- Merged PR 3101: [build] install pkg-config for macos buddy builds.
- Merged PR 3098: [nfc] Move vectorization code to separate files.
- Merged PR 3100: Adds CMake dependencies to acc-translate to ensure
- Merged PR 3095: Remove duplicate SubArray class. [Mason Remy]
- Merged PR 3073: vectorize masked load store. [JUBI TANEJA]
- Merged PR 3093: Add meaningful error messages for c++ exceptions.
- Merged PR 3092: Add type size getter utility. [Captain Jack Sparrow]
- Merged PR 3074: Add rudimentary pass to fix redundant load/store
- + 13 more
v1.2.20
📋 What's Changed
- Merged PR 3070: Fixes for sub_array and _split_dimension [Mason Remy]
- Fixes for sub_array and _split_dimension
- This fixes the sub_array and split dim ops so they work with the Accera codebase, which has changed around them. Some MemoryLayout assumptions get in the way and have been disabled for the short term; in the long term, our memory layout behavior should more closely match what MLIR affine maps can represent, for more general dynamic support.
- Merged PR 3063: Refactor Dimension with C++ backend container class and few other fixes [Captain Jack Sparrow]
- + 6 more
v1.2.19
📋 Changes
- Merged PR 3069: Set target device features on module and check when
- Merged PR 3060: Adds support for sqrt op in acc-translate. [Kern
v1.2.18
📋 Changes
- Merged PR 3055: Move value unrolling to after function inlining and
- Merged PR 3053: Add package.build flags for building with higher-
- Merged PR 3052: Place heap allocations at the top level of the
- Merged PR 3050: [non-func, API] Change Nest.get_shape() to always
- Merged PR 3030: Include acc-translate whenever accera is installed.
- Merged PR 3035: [nfc] Adds my machine to targets.py. [Kern Handa]
v1.2.17
📋 What's Changed
- Merged PR 3029: Work around constraint resolution issues with dynamic split size 1. [Mason Remy]
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.16...v1.2.17
v1.2.16
📋 What's Changed
- Merged PR 3027: Hack required to use Array as output element argument (Dimension) [Captain Jack Sparrow]
- Merged PR 3025: Add arg name and size string required for hat metadata. [Captain Jack Sparrow]
- Merged PR 3017: Output array supports gather function. [Denny Sun]
- Add the dsl test for gather function.
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.15...v1.2.16
v1.2.15
📋 What's Changed
- Merged PR 3018: Use VS 17.4.3-built binaries. This is in a separate channel to allow older ve... [Mason Remy]
- Use VS 17.4.3-built binaries. This is in a separate channel to allow older versions to keep working
- Merged PR 3012: Correctness check for output array support for range node. [Denny Sun]
- Successful correctness check means output array support can work end to end.
- Merged PR 3015: Update hatlib version to support floating type as function arg. [Denny Sun]
- + 13 more
v1.2.14
📋 Changes
- Merged PR 3001: [test] Expect failures on macos for x86 intrinsics
- Merged PR 3000: Expect failures for macos in vpmaddwd tests. [Lisa
- Merged PR 2994: Bump hatlib to 0.0.32. [Lisa Ong]
- Merged PR 2997: Support more casting cases in vpmaddwd matcher. [Mason
- Merged PR 2996: [release] bump docs to 1.2.14 for next release. [Lisa
v1.2.13
📋 Changes
- Merged PR 2987: Add support for max/min/round ops and vectorizing
- Merged PR 2963: Control TEMP array allocation location. [Mason Remy]
- Merged PR 2962: Expand vpmaddwd matching and add intrinsic call.
- Merged PR 2961: Match more vectorization patterns and support
- 2-loop interleaving store -> vector shuffle and store
- simple horizontal reductions (not always efficient currently)
- vectorized casts
- Merged PR 2960: Enable marking functions as no-inline-into. [Mason
- + 10 more
v1.2.12
📋 What's Changed
- Merged PR 2953: Workaround debug mode failures with dimension argument ordering. [Lisa Ong]
- Order dimension arguments after Array args to avoid this lowering issue in Debug mode (until Debug mode is fixed)
```
%42 = llvm.insertvalue %7, %41[3, 0] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<2 x i64>, array<2 x i64>)>
^
%7 = llvm.insertvalue %arg6, %6[4, 1] : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<2 x i64>, array<2 x i64>)>
^
```
- + 61 more
v1.2.11
📋 What's Changed
- Update vcpkg by @AtariDreams in https://github.com/microsoft/Accera/pull/52
- Merged PR 2924: Update hatlib dependency in setup.cfg, add comment. [Lisa Ong]
- Merged PR 2922: [Github] Update vcpkg. [Lisa Ong]
- Merged PR 2910: Updates hatlib dependency to 0.0.29. [Kern Handa]
- Merged PR 2905: Fix internal param name in GPU benchmarks. [Captain Jack Sparrow]
- + 47 more
✨ New Contributors
- @AtariDreams made their first contribution in https://github.com/microsoft/Accera/pull/52
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.10...v1.2.11
v1.2.10
📋 What's Changed
- Update ci.yml to fix path changes by @lisaong in https://github.com/microsoft/Accera/pull/49
- Add unrolled convolution case study link by @marina-neseem in https://github.com/microsoft/Accera/pull/50
- Bump protobuf from 3.20.1 to 3.20.2 in /accera/onnx-emitter/test by @dependabot in https://github.com/microsoft/Accera/pull/51
- Merged PR 2886: [release] Bump docs to 1.2.10, sync GH to ADO. [Lisa Ong]
- Bulk docs version update
- Bump protobuf from 3.20.1 to 3.20.2 in /accera/onnx-emitter/test (d1b87ec6820417da0e86d60094eacf354398d4b9)
- Also fixing a minor docs bug (errant backtick)
- + 74 more
v1.2.9
📋 Changes
- Merged PR 2862: write runtime size of index type to Hat. [Denny Sun]
- Merged PR 2861: Fix cache_C benchmark variable which is not getting
- Merged PR 2864: [build]: fix breaks due to agent image updates. [Lisa
- Only pipelines that rely on azure build agents are affected.
- We still need to keep the submodule around to enable external builds from the Github repo.
- Remove defunct pipeline
- Update vcpkg submodule while we're here
- Merged PR 2839: Enable CUDA output caching. [Ritwik Das]
- + 37 more
v1.2.8
📋 What's Changed
- Set license field in metadata of package by @tonybaloney in https://github.com/microsoft/Accera/pull/46
- Github codespaces configuration by @lisaong in https://github.com/microsoft/Accera/pull/48
- Merged PR 2814: Parameterize batch_size in GPU benchmarks. [Ritwik Das]
- Merged PR 2810: [release] [nfc] Bump docs version to 1.2.8, bump github actions to llvm 14.0.6. [Lisa Ong]
- + 189 more
✨ New Contributors
- @tonybaloney made their first contribution in https://github.com/microsoft/Accera/pull/46
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.7...v1.2.8
v1.2.7
📋 Changes
- Merged PR 2744: [doc] Fixes link in reference/functions/cast.md, revs
- Merged PR 2743: [DSL] Document implicit casting rules and the explicit
- Document implicit casting rules implemented by !2693
- Promote `acc.cast` to a documented function to give the user control to override implicit casting behavior
- Merged PR 2739: Updates ROCM tensorization pattern to handle casting.
- Merged PR 2643: Some fixes for last major array caching in
- Merged PR 2693: Updates DSL codegen to implicitly cast if possible.
- Merged PR 2735: Pass multiple input files as comma-separated list to
- + 11 more
v1.2.6
📋 What's Changed
- Bump urllib3 from 1.25.8 to 1.26.5 in /tools/benchmarkers by @dependabot in https://github.com/microsoft/Accera/pull/42
- [ci] Fix out of disk space errors for CI workflow by @lisaong in https://github.com/microsoft/Accera/pull/43
- Bump bottle from 0.12.19 to 0.12.20 in /tools/viz by @dependabot in https://github.com/microsoft/Accera/pull/44
- Merged PR 2657: Add conversion pass from gpu ops to rocdl ops. [Ritwik Das]
- switch to gpu dialect for gpu index ops
- add conversion pass from gpu dialect to rocdl
- Merged PR 2652: Add integer tensor ops support for AMD targets. [Ritwik Das]
- + 303 more
✨ New Contributors
- @dependabot made their first contribution in https://github.com/microsoft/Accera/pull/42
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.5...v1.2.6
v1.2.5
📋 What's Changed
- Add link to the NCHWc 2D Convolution Case Study by @marina-neseem in https://github.com/microsoft/Accera/pull/41
- Merged PR 2593: [docs] [release] bump docs version to 1.2.5 in preparation for release. [Lisa Ong]
- Merged PR 2586: Loop order and indices as parameters [Denny Sun]
- With this change, the user can write a schedule with loop_order parameterized:
- `loop_order = create_parameters()`
- `schedule.reorder(order=loop_order)`
- + 119 more
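The idea behind PR 2586, treating loop order as a parameter rather than a fixed choice, can be illustrated without Accera. The sketch below is plain Python and does not use or reproduce the Accera API; `run_nest` and `matmul` are hypothetical helpers introduced only for this illustration. It runs the same matrix-multiply loop nest under every permutation of its three loops and collects the results, which is the kind of sweep a parameterized `loop_order` makes possible.

```python
from itertools import permutations, product


def run_nest(shape, loop_order, body):
    """Visit every point of a rectangular loop nest.

    loop_order[0] is the outermost loop and loop_order[-1] the innermost.
    (Hypothetical helper, not part of Accera.)
    """
    ranges = [range(shape[axis]) for axis in loop_order]
    for point in product(*ranges):
        # Map the visited point back to the original (i, j, k) axis numbering.
        index = [0] * len(shape)
        for axis, value in zip(loop_order, point):
            index[axis] = value
        body(*index)


def matmul(A, B, loop_order):
    """Compute C = A @ B, iterating the (i, j, k) nest in the given order."""
    M, K, N = len(A), len(B), len(B[0])
    C = [[0] * N for _ in range(M)]

    def body(i, j, k):
        C[i][j] += A[i][k] * B[k][j]

    run_nest((M, N, K), loop_order, body)
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Sweep all six loop orders; the loop order is just another input value.
results = [matmul(A, B, order) for order in permutations(range(3))]
```

Every loop order produces the same matrix here because integer accumulation is order-independent; in a real compiler the orders differ in performance, which is why exposing the order as a parameter is useful for tuning.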
✨ New Contributors
- @marina-neseem made their first contribution in https://github.com/microsoft/Accera/pull/41
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.4...v1.2.5
v1.2.4
📋 What's Changed
- Docs refactoring install by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/27
- Revise Pi3_Cross_Compilation.md by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/28
- Docs refactoring tutorials hello matmul by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/29
- Docs refactoring tutorials hello matmul gpu by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/30
- Docs refactoring tutorials optimized matmul by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/31
- Refactoring of Accera.md from reference docs by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/32
- Complete refactoring of safety analysis by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/33
- Refactoring of functions docs in reference files by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/34
- + 208 more
v1.2.3
📋 What's Changed
- Docs refactoring manual fusing by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/26
- Merged PR 2508: [release] Bump docs version to 1.2.3. [Lisa Ong]
- In preparation for a PyPI release to facilitate community contributions for case studies
- Synced doc editorials from public Github repo
- Merged PR 2503: [prog] Support unsigned integer types in the DSL. [Lisa Ong]
- Add ScalarType.uint8/16/32/64 support
- Use UnrealizedConversionCastOps to convert these unsigned ints to signless ints
- + 145 more
v1.2.2
📋 What's Changed
- Full Changelog: https://github.com/microsoft/Accera/compare/v1.2.1...v1.2.2
- Add Ubuntu CI workflow by @lisaong in https://github.com/microsoft/Accera/pull/9
- Rework documentation sections by @lisaong in https://github.com/microsoft/Accera/pull/8
- Manually run script to update doc versions by @lisaong in https://github.com/microsoft/Accera/pull/10
- Show more perf gains for the quickstart example by @lisaong in https://github.com/microsoft/Accera/pull/12
- Fix post merge build break by @lisaong in https://github.com/microsoft/Accera/pull/14
- README.md refactoring by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/13
- Complete refactoring of file array.md and simple affine loop nests.md file in manual docs by @Arslan-e-Mustafa in https://github.com/microsoft/Accera/pull/16
- + 420 more
✨ New Contributors
- @Arslan-e-Mustafa made their first contribution in https://github.com/microsoft/Accera/pull/13
v1.2.1
📋 What's Changed
- Merged PR 2391: Update quickstart example, updated docs structure per feedback. [Lisa Ong]
- Teasers for transformations in the Quickstart sample (to differentiate Accera from others), with benchmarking
- Removed the Miscellaneous section, redistributed various docs to various related locations
- Renamed the cross compilation tutorial so that it is ordered last
- Merged PR 2392: Populate Target.Models based on known devices. [Kern Handa]
- + 502 more
