Pull requests: pytorch/FBGEMM
Pull requests list
Port merge_embeddings benchmark to tritonbench
cla signed
fb-exported
meta-exported
#5650
opened Apr 16, 2026 by
q10
Contributor
Validate total_num_blocks divisibility by my_size in block_bucketize (#5646)
cla signed
fb-exported
meta-exported
#5649
opened Apr 16, 2026 by
q10
Contributor
Fix bf16 rounding to IEEE 754 ties-to-even
cla signed
#5648
opened Apr 16, 2026 by
cyyever
Contributor
Validate total_num_blocks divisibility by my_size in block_bucketize
cla signed
#5646
opened Apr 16, 2026 by
cyyever
Contributor
Skip scratch pad eviction data in enrichment mode to avoid cudaFree overhead
cla signed
fb-exported
meta-exported
#5645
opened Apr 16, 2026 by
EddyLXJ
Contributor
Add CPU support in fbgemm for FloatToFP8RowwiseQuantized and FP8RowwiseQuantizedToFloat
cla signed
fb-exported
meta-exported
#5644
opened Apr 15, 2026 by
djjatmeta
Support multi-dimensional runtime_meta in RES streaming buffers by lazy init (#5643)
cla signed
fb-exported
meta-exported
#5643
opened Apr 15, 2026 by
FriedCosey
Fix TBE v2 forward kernel for embedding dim > 1024 (#5326) (#5569)
cla signed
fb-exported
meta-exported
#5641
opened Apr 15, 2026 by
q10
Contributor
Add UVM host-mapped memory support for dense TBE kernel (#5640)
cla signed
fb-exported
meta-exported
#5640
opened Apr 15, 2026 by
TroyGarden
Contributor
Fix intra-warp and inter-warp race conditions in bounds_check_indices v1 and v2 CUDA kernels
cla signed
fb-exported
meta-exported
#5638
opened Apr 15, 2026 by
gchalump
Contributor
Add missing async proxy fence
cla signed
fb-exported
meta-exported
#5637
opened Apr 15, 2026 by
lw
Contributor
Add aligned_unique_ptr RAII wrapper to avoid leak risks (#5609)
cla signed
fb-exported
meta-exported
#5615
opened Apr 11, 2026 by
q10
Contributor
Add CUDA 13.2 support to CI and release workflows (#5610)
cla signed
fb-exported
meta-exported
#5610
opened Apr 10, 2026 by
gchalump
Contributor
Port batched_dense_vec_jagged_2d_mul and jagged_1d_to_truncated_values to tritonbench
cla signed
fb-exported
meta-exported
#5603
opened Apr 9, 2026 by
q10
Contributor
Replace rocm-smi with amd-smi across ROCm build, CI, and docs
cla signed
module: rocm
#5597
opened Apr 8, 2026 by
adam360x
3 tasks done
bf16 scale/bias for INT4 (#5595)
cla signed
fb-exported
meta-exported
#5595
opened Apr 8, 2026 by
jeetkanjani7
Enable more clang-tidy checks on C++20 (#5575)
cla signed
fb-exported
meta-exported
module: rocm
#5588
opened Apr 7, 2026 by
q10
Contributor
Add gflag to select feature names for SSD KV embedding table
cla signed
fb-exported
meta-exported
#5585
opened Apr 7, 2026 by
jnwan
Split RowWiseSparseAdagradFused.cc.stripped.o from fbcode//admarket/adfinder:adfinder
cla signed
fb-exported
meta-exported
#5578
opened Apr 6, 2026 by
meta-codesync
bot
Port expand_into_jagged_permute benchmark to tritonbench
cla signed
fb-exported
meta-exported
#5566
opened Apr 1, 2026 by
q10
Contributor
Fix bash scripts to fail correctly for ROCm jobs (#5564)
ciflow/rocm-mi300
cla signed
fb-exported
meta-exported
module: rocm
#5564
opened Mar 31, 2026 by
q10
Contributor
Add AMD/ROCm support for SSD TBE inference
cla signed
fb-exported
meta-exported
module: rocm
#5561
opened Mar 31, 2026 by
goldcoderZ
Contributor
Add TurboSSDInferenceModule for HSTU serving integration
cla signed
fb-exported
meta-exported
#5560
opened Mar 31, 2026 by
goldcoderZ
Contributor