Cloud Kernel Release 18

@casparant released this 25 Mar 13:42
· 3470 commits to linux-next since this release

Cloud Kernel release 18 is rolling out! This release is rebased onto v4.19.91 LTS. Here is what else it brings (fair warning: this is a really long changelog):

Highlights: Features, Enhancements and Bug Fixes from the Alibaba Cloud Linux Internal Version

  • alinux: mm: add proc interface to control context readahead (Xiaoguang Wang)
  • alinux: Hookers: add arm64 support (Zou Cao)
  • alinux: mm, memcg: export workingset counters on memcg v1 (Xu Yu)
  • alinux: pci/iohub-sriov: Support for Alibaba PCIe IOHub SRIOV (liushanghui)
  • alinux: mm, memcg: abort priority oom if with oom victim (Xu Yu)
  • alinux: mm, memcg: account number of processes in the css (Xu Yu)
  • alinux: mm, memcg: fix soft lockup in priority oom (Xu Yu)
  • alinux: mm, memcg: record latency of memcg wmark reclaim (Xu Yu)
  • alinux: doc: use unified official project name Cloud Kernel (Caspar Zhang)
  • alinux: mm: oom_kill: show killed task's cgroup info in global oom (Wenwei Tao)
  • alinux: mm: memcontrol: enable oom.group on cgroup-v1 (Wenwei Tao)
  • alinux: doc: alibaba: Add priority oom descriptions (Wenwei Tao)
  • alinux: mm: memcontrol: introduce memcg priority oom (Wenwei Tao)
  • alinux: kernel: cgroup: account number of tasks in the css and its descendants (Wenwei Tao)
  • alinux: doc: Add Documentation/alibaba/interfaces.rst (Xunlei Pang)
  • alinux: memcg: Account throttled time due to memory.wmark_min_adj (Xunlei Pang)
  • alinux: memcg: Introduce memory.wmark_min_adj (Xunlei Pang)
  • alinux: memcg: Provide users the ability to reap zombie memcgs (Xunlei Pang)
  • alinux: jbd2: track slow handle which is preventing transaction committing (Xiaoguang Wang)
  • alinux: fs: record page or bio info while process is waitting on it (Xiaoguang Wang)
  • alinux: blk: add iohang check function (Xiaoguang Wang)
  • alinux: mm,memcg: export memory.{min,low} to cgroup v1 (Xu Yu)
  • alinux: mm,memcg: export memory.{events,events.local} to v1 (Xu Yu)
  • alinux: mm,memcg: export memory.high to v1 (Xu Yu)
  • alinux: arm64: add livepatch support (Zou Cao)
  • alinux: blk-throttle: fix logic error about BIO_THROTL_STATED in throtl_bio_end_io() (Xiaoguang Wang)
  • alinux: jbd2: fix build errors (Xiaoguang Wang)
  • alinux: mm: remove unused variable (Joseph Qi)
  • alinux: jbd2: fix build warnings (Joseph Qi)
  • alinux: mm: kidled: fix frame-larger-than build warning (Xu Yu)
  • alinux: mm: thp: remove deferred split queue from mem_cgroup (Caspar Zhang)
  • alinux: psi: using cpuacct_cgrp_id under CONFIG_CGROUP_CPUACCT (Joseph Qi)
  • alinux: iocost: fix format mismatch build warning (Joseph Qi)
  • alinux: mm: memcontrol: memcg_wmark_wq can be static (kbuild test robot)
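
Several of the entries above export cgroup v2 style files such as memory.events and memory.events.local to cgroup v1. Upstream, memory.events is a flat-keyed file with one "name value" pair per line. The sketch below parses that layout. It assumes the v1 export keeps the same layout and that the memory controller is mounted at /sys/fs/cgroup/memory; both are assumptions, and the function names are made up for illustration.

```python
from pathlib import Path
from typing import Dict

# Assumption: conventional cgroup v1 mount point for the memory controller.
CGROUP_V1_MEMORY_ROOT = Path("/sys/fs/cgroup/memory")


def parse_flat_keyed(text: str) -> Dict[str, int]:
    """Parse 'name value' lines, the flat-keyed layout used by memory.events."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not exactly 'name <integer>'.
        if len(parts) != 2 or not parts[1].isdigit():
            continue
        counters[parts[0]] = int(parts[1])
    return counters


def read_memcg_events(group: str, local: bool = False) -> Dict[str, int]:
    """Read memory.events (or memory.events.local) for a memcg, e.g. 'mygroup'."""
    name = "memory.events.local" if local else "memory.events"
    return parse_flat_keyed((CGROUP_V1_MEMORY_ROOT / group / name).read_text())
```

Calling read_memcg_events("mygroup") on a kernel without these files raises FileNotFoundError, which doubles as a quick check of whether the export is present.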

New Features and Enhancements From Upstream

  • AMD CPU Enhancements
  • Hygon CPU Support
  • IOUring Support
  • cpuidle: Support guest halt polling (Yihao Wu)
  • mm: fix trying to reclaim unevictable lru page when calling madvise_pageout (zhong jiang)
  • mm: factor out common parts between MADV_COLD and MADV_PAGEOUT (Minchan Kim)
  • mm: introduce MADV_PAGEOUT (Minchan Kim)
  • mm: introduce MADV_COLD (Minchan Kim)
  • mm: change PAGEREF_RECLAIM_CLEAN with PAGE_REFRECLAIM (Minchan Kim)
  • arm64: mm: implement pte_devmap support (Shannon Zhao)
  • add the support of patchable-function-entry for hotfix kpatch with gcc 9.2 (Zou Cao)
  • KVM: arm64: Add support 1G hugepages at stage 2 (Shannon Zhao)
  • spi: spi: add GPIO chipselect support (Baoyou Xie)
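
MADV_COLD and MADV_PAGEOUT are hints passed to madvise(2): the first marks a range as unlikely to be used soon, the second asks the kernel to reclaim it right away. Neither discards data. A minimal sketch of trying them from user space follows, assuming Python 3.8 or later on Linux. The numeric values come from the Linux uapi header asm-generic/mman-common.h and are spelled out because Python's mmap module may not export these names; the helper names are hypothetical.

```python
import mmap

# Values from the Linux uapi header asm-generic/mman-common.h.
MADV_COLD = 20
MADV_PAGEOUT = 21


def advise(region: mmap.mmap, advice: int) -> bool:
    """Apply advice to the whole mapping.

    Returns True if the kernel accepted it and False on any OSError
    (for example EINVAL on a kernel that does not know the hint).
    """
    try:
        region.madvise(advice)
    except OSError:
        return False
    return True


def demo() -> bytes:
    """Write to a private anonymous mapping, hint the kernel, read it back."""
    length = 4 * mmap.PAGESIZE
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        region[:5] = b"hello"
        advise(region, MADV_COLD)     # deactivate the pages
        advise(region, MADV_PAGEOUT)  # ask for immediate reclaim
        # Both hints are non-destructive, so the contents are still there.
        return bytes(region[:5])
    finally:
        region.close()
```

Whether the pages are actually written out depends on the system (for instance, whether swap is configured); the hints only make them candidates for reclaim.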

Kernel Config Changes

  • configs: enable overlay redirect dir and inode index by default
  • configs: Build support for Alibaba PCIe IOHub SRIOV
  • configs: enable CONFIG_FTRACE_SYSCALLS on x86_64 kernel
  • configs: Enable arm64 hookers support
  • configs: enable CONFIG_LIVEPATCH for aarch64
  • configs: enable NVME block device support
  • configs: enable intel idle driver
  • configs: enable guest halt polling support
  • configs: enable X86 PM timer support
  • configs: enable io wq for iouring
  • configs: add CGROUP_BPF support on X86
  • configs: add vmware support
  • configs: enable SOFT_WATCHDOG
  • configs: enable Hygon support
  • configs: enable iocost for aarch64
  • configs: enable CONFIG_BLK_DEBUG_FS by default
  • configs: add aarch64 config base
  • configs: enable deferred page init
  • configs: always enable THP by default
  • configs: enable iouring support
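
To confirm that a given build actually carries some of these options, the kernel config text can be checked directly. The sketch below parses the usual "CONFIG_X=value" and "# CONFIG_X is not set" lines. Where the config text comes from (commonly /boot/config-<release> or /proc/config.gz) varies by distribution and is left to the caller; the function names are illustrative only.

```python
import re
from typing import Dict, Iterable, List

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text: str) -> Dict[str, str]:
    """Map option names to their values; explicitly unset options map to 'n'."""
    options: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(text: str, wanted: Iterable[str]) -> List[str]:
    """Return the wanted options that are unset or absent from the config."""
    options = parse_kernel_config(text)
    return [name for name in wanted if options.get(name, "n") == "n"]
```

For example, passing the config text together with ["CONFIG_LIVEPATCH", "CONFIG_FTRACE_SYSCALLS", "CONFIG_BLK_DEBUG_FS"] returns whichever of those three are not enabled.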

Other Bug Fixes

  • vfs: fix do_last() regression (Al Viro)
  • io-wq: wait for io_wq_create() to setup necessary workers (Jens Axboe) {CVE-2019-19241}
  • io_uring: async workers should inherit the user creds (Jens Axboe) {CVE-2019-19241}
  • io-wq: have io_wq_create() take a 'data' argument (Jens Axboe) {CVE-2019-19241}
  • io_wq: add get/put_work handlers to io_wq_create() (Jens Axboe) {CVE-2019-19241}
  • dccp: Fix memleak in __feat_register_sp (YueHaibing) {CVE-2019-20096}
  • scsi: libsas: stop discovering if oob mode is disconnected (Jason Yan) {CVE-2019-19965}
  • drm/i915/gen9: Clear residual context state on context switch (Akeem G Abodunrin) {CVE-2019-14615}
  • RDMA: Fix goto target to release the allocated memory (Navid Emamdoost) {CVE-2019-19077}
  • ipmi: Fix memory leak in __ipmi_bmc_register (Navid Emamdoost) {CVE-2019-19046}
  • vt: selection, close sel_buffer race (Jiri Slaby) {CVE-2020-8648}
  • vgacon: Fix a UAF in vgacon_invert_region (Zhang Xiaoxu) {CVE-2020-8647,CVE-2020-8649}
  • do_last(): fetch directory ->i_mode and ->i_uid before it's too late (Al Viro) {CVE-2020-8428}
  • x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bit (Boris Ostrovsky) {CVE-2019-3016}
  • KVM: nVMX: Check IO instruction VM-exit conditions (Oliver Upton) {CVE-2020-2732}
  • KVM: nVMX: Refactor IO bitmap checks into helper function (Oliver Upton) {CVE-2020-2732}
  • KVM: nVMX: Don't emulate instructions in guest mode (Paolo Bonzini) {CVE-2020-2732}
  • mm: fix tick timer stall during deferred page init (Shile Zhang)
  • bpf/sockmap: Read psock ingress_msg before sk_receive_queue (Lingpeng Chen)
  • mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks() (Tetsuo Handa)
  • io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled (Xiaoguang Wang)
  • md: make sure desc_nr less than MD_SB_DISKS (Yufen Yu)
  • md: avoid invalid memory access for array sb->dev_roles (Yufen Yu)
  • md: no longer compare spare disk superblock events in super_load (Yufen Yu)
  • md: return -ENODEV if rdev has no mddev assigned (Pawel Baldysiak)
  • md/raid10: Fix raid10 replace hang when new added disk faulty (Alex Wu)
  • cpuidle: governor: Add new governors to cpuidle_governors again (Rafael J. Wysocki)
  • kvm: x86: add host poll control msrs (Marcelo Tosatti)
  • KVM: arm64: Opportunistically turn off WFI trapping when using direct LPI injection (Marc Zyngier)
  • KVM: vgic-v4: Track the number of VLPIs per vcpu (Marc Zyngier)
  • KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put (Marc Zyngier)
  • EDAC, skx: Retrieve and print retry_rd_err_log registers (Tony Luck)
  • tools headers uapi: Sync asm-generic/mman-common.h with the kernel (Arnaldo Carvalho de Melo)
  • tools build: Check if gettid() is available before providing helper (Arnaldo Carvalho de Melo)
  • efi: Make efi_rts_work accessible to efi page fault handler (Sai Praneeth)
  • netfilter: conntrack: udp: set stream timeout to 2 minutes (Florian Westphal)
  • netfilter: conntrack: udp: only extend timeout to stream mode after 2s (Florian Westphal)
  • iomap: Allow forcing of waiting for running DIO in iomap_dio_rw() (Jan Kara)
  • io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL (Xiaoguang Wang)
  • io_uring: add io_uring support (Joseph Qi)
  • ext4: start to support iopoll method (Xiaoguang Wang)
  • ext4: Move to shared i_rwsem even without dioread_nolock mount opt (Ritesh Harjani)
  • ext4: Start with shared i_rwsem in case of DIO instead of exclusive (Ritesh Harjani)
  • ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT (Ritesh Harjani)
  • ext4: introduce direct I/O write using iomap infrastructure (Matthew Bobrowski)
  • iomap: move the iomap_dio_rw ->end_io callback into a structure (Christoph Hellwig)
  • ext4: update ext4_sync_file() to not use __generic_file_fsync() (Matthew Bobrowski)
  • ext4: move inode extension check out from ext4_iomap_alloc() (Matthew Bobrowski)
  • ext4: move inode extension/truncate code out from ->iomap_end() callback (Matthew Bobrowski)
  • ext4: introduce direct I/O read using iomap infrastructure (Matthew Bobrowski)
  • ext4: introduce new callback for IOMAP_REPORT (Matthew Bobrowski)
  • iomap: use a srcmap for a read-modify-write I/O (Goldwyn Rodrigues)
  • ext4: split IOMAP_WRITE branch in ext4_iomap_begin() into helper (Matthew Bobrowski)
  • ext4: move set iomap routines into a separate helper ext4_set_iomap() (Matthew Bobrowski)
  • ext4: iomap that extends beyond EOF should be marked dirty (Matthew Bobrowski)
  • ext4: update direct I/O read lock pattern for IOCB_NOWAIT (Matthew Bobrowski)
  • ext4: reorder map.m_flags checks within ext4_iomap_begin() (Matthew Bobrowski)
  • x86/amd_nb: Make hygon_nb_misc_ids static (Pu Wen)
  • io-wq: add support for bounded vs unbunded work (Jens Axboe)
  • io-wq: io_wqe_run_queue() doesn't need to use list_empty_careful() (Jens Axboe)
  • io-wq: use proper nesting IRQ disabling spinlocks for cancel (Jens Axboe)
  • io-wq: use kfree_rcu() to simplify the code (YueHaibing)
  • net: add __sys_accept4_file() helper (Jens Axboe)
  • sched/core, workqueues: Distangle worker accounting from rq lock (Thomas Gleixner)
  • sched: Remove stale PF_MUTEX_TESTER bit (Thomas Gleixner)
  • ixgbe: Fix calculation of queue with VFs and flow director on interface flap (Cambda Zhu)
  • tcp: do not leave dangling pointers in tp->highest_sack (Eric Dumazet)
  • include/linux/notifier.h: SRCU: fix ctags (Sam Protsenko)
  • mm: thp: don't need care deferred split queue in memcg charge move path (Wei Yang)
  • signal: simplify set_user_sigmask/restore_user_sigmask (Oleg Nesterov)
  • block: never take page references for ITER_BVEC (Christoph Hellwig)
  • signal: remove the wrong signal_pending() check in restore_user_sigmask() (Oleg Nesterov)
  • uio: make import_iovec()/compat_import_iovec() return bytes on success (Jens Axboe)
  • blk-mq: fix NULL pointer deference in case no poll implementation (Joseph Qi)
  • req->error only used for iopoll (Stefan Bühler)
  • fs: add sync_file_range() helper (Jens Axboe)
  • drm/amdgpu/gmc: fix compiler errors [-Werror,-Wmissing-braces] (V2) (Shirish S)
  • add perf smmu-v3 support and fixed duplicate function (Zou Cao)
  • iommu/dma: Use NUMA aware memory allocations in __iommu_dma_alloc_pages() (Ganapatrao Kulkarni)
  • mm/hotplug: make remove_memory() interface usable (Pavel Tatashin)
  • mm/memory_hotplug: make remove_memory() take the device_hotplug_lock (David Hildenbrand)
  • mm: initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections (Alexander Duyck)
  • mm: implement new zone specific memblock iterator (Alexander Duyck)
  • mm: drop meminit_pfn_in_nid as it is redundant (Alexander Duyck)
  • mm: use mm_zero_struct_page from SPARC on all 64b architectures (Alexander Duyck)
  • nvme-mpath: remove I/O polling support (Christoph Hellwig)
  • amd-gpu: Don't undefine READ and WRITE (David Howells)
  • blk-mq: grab .q_usage_counter when queuing request from plug code path (Ming Lei)
  • block/bfq: fix ifdef for CONFIG_BFQ_GROUP_IOSCHED=y (Konstantin Khlebnikov)
  • block: remove bogus check for queue_lock assignment (Jens Axboe)
  • block: don't use bio->bi_vcnt to figure out segment number (Ming Lei)
  • scsi: core: Run queue when state is set to running after being blocked (zhengbin)
  • block: fix NULL pointer dereference in register_disk (zhengbin)
  • blk-mq: Add a NULL check in blk_mq_free_map_and_requests() (Dan Carpenter)
  • blk-mq: place trace_block_getrq() in correct place (Xiaoguang Wang)
  • blk-mq: protect debugfs_create_files() from failures (Greg Kroah-Hartman)
  • blk-mq: not embed .mq_kobj and ctx->kobj into queue instance (Ming Lei)
  • blk-mq: fallback to previous nr_hw_queues when updating fails (Jianchao Wang)
  • blk-mq: realloc hctx when hw queue is mapped to another node (Jianchao Wang)
  • blk-mq: adjust debugfs and sysfs register when updating nr_hw_queues (Jianchao Wang)
  • mm/memblock.c: skip kmemleak for kasan_init() (Qian Cai)
  • tpm: tpm_tis_spi: Introduce a flow control callback (Stephen Boyd)
  • tcp: Add snd_wnd to TCP_INFO (Thomas Higdon)
  • tcp: Add TCP_INFO counter for packets received out-of-order (Thomas Higdon)
  • mm: don't raise MEMCG_OOM event due to failed high-order allocation (Roman Gushchin)
  • mm, memcg: introduce memory.events.local (Shakeel Butt)
  • mm, memcg: consider subtrees in memory.events (Chris Down)
  • iio: adc: ti-ads7950: use SPI_CS_WORD to reduce CPU usage (David Lechner)
  • spi: spi-davinci: Add support for SPI_CS_WORD (David Lechner)
  • spi: add software implementation for SPI_CS_WORD (David Lechner)
  • spi: add new SPI_CS_WORD flag (David Lechner)
  • spi: davinci: Remove chip select GPIO pdata (Linus Walleij)
  • block: fix 32 bit overflow in __blkdev_issue_discard() (Dave Chinner)
  • block: cleanup __blkdev_issue_discard() (Ming Lei)
  • iov_iter: fix iov_iter_type (Ming Lei)
  • tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd (Arnaldo Carvalho de Melo)
  • block: add BIO_NO_PAGE_REF flag (Jens Axboe)
  • iov_iter: add ITER_BVEC_FLAG_NO_REF flag (Jens Axboe)
  • net: split out functions related to registering inflight socket files (Jens Axboe)
  • block: implement bio helper to add iter bvec pages to bio (Jens Axboe)
  • fs: add fget_many() and fput_many() (Jens Axboe)
  • xfs: Fix stale data exposure when readahead races with hole punch (Jan Kara)
  • fs: Export generic_fadvise() (Jan Kara)
  • xfs: fix missed wakeup on l_flush_wait (Rik van Riel)
  • fs: xfs: xfs_log: Don't use KM_MAYFAIL at xfs_log_reserve(). (Tetsuo Handa)
  • xfs: fix off-by-one error in rtbitmap cross-reference (Darrick J. Wong)
  • xfs: unlock inode when xfs_ioctl_setattr_get_trans can't get transaction (Darrick J. Wong)
  • xfs: fix backwards endian conversion in scrub (Darrick J. Wong)
  • xfs: libxfs: move xfs_perag_put late (Pan Bian)
  • xfs: finobt AG reserves don't consider last AG can be a runt (Dave Chinner)
  • exportfs: fix 'passing zero to ERR_PTR()' warning (YueHaibing)
  • NFS: change sign of nfs_fh length (Frank Sorenson)
  • nfs: fix xfstest generic/099 failed on nfsv3 (ZhangXiaoxu)
  • fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback (Amir Goldstein)
  • sysfs: convert BUG_ON to WARN_ON (Greg Kroah-Hartman)
  • ext4: fix integer overflow when calculating commit interval (zhangyi (F))
  • ext4: cond_resched in work-heavy group loops (Khazhismel Kumykov)
  • jbd2: discard dirty data when forgetting an un-journalled buffer (zhangyi (F))
  • ext4: replace opencoded i_writecount usage with inode_is_open_for_write() (Nikolay Borisov)
  • block: introduce mp_bvec_for_each_page() for iterating over page (Ming Lei)
  • block: introduce bvec_nth_page() (Joseph Qi)
  • iomap: wire up the iopoll method (Christoph Hellwig)
  • block: add bio_set_polled() helper (Jens Axboe)
  • block: wire up block device iopoll method (Christoph Hellwig)
  • fs: add an iopoll method to struct file_operations (Christoph Hellwig)
  • block: clear REQ_HIPRI if polling is not supported (Christoph Hellwig)
  • signal: Add restore_user_sigmask() (Deepa Dinamani)
  • signal: Add set_user_sigmask() (Deepa Dinamani)
  • block: remove ->poll_fn (Christoph Hellwig)
  • block: make blk_poll() take a parameter on whether to spin or not (Jens Axboe)
  • blk-mq: when polling for IO, look for any completion (Jens Axboe)
  • block: Introduce get_current_ioprio() (Damien Le Moal)
  • block: have ->poll_fn() return number of entries polled (Jens Axboe)
  • block: for async O_DIRECT, mark us as polling if asked to (Jens Axboe)
  • block: add REQ_HIPRI and inherit it from IOCB_HIPRI (Jens Axboe)
  • iov_iter: Separate type from direction and use accessor functions (David Howells)
  • iov_iter: Use accessor function (David Howells)
  • EDAC: skx_common: downgrade message importance on missing PCI device (Aristeu Rozanski)
  • tcp: Fix highest_sack and highest_sack_seq (Cambda Zhu)