android_kernel_xiaomi_sdm845/mm
Ilya Lipnitskiy 10bb21b828 mm: fix race by making init_zero_pfn() early_initcall
commit e720e7d0e983bf05de80b231bccc39f1487f0f16 upstream.

There are code paths that rely on zero_pfn being fully initialized
before core_initcall.  For example, wq_sysfs_init() is a core_initcall
function that eventually results in a call to kernel_execve, which
causes a page fault with a subsequent mmput.  If zero_pfn is not
initialized by then, the mm's RSS accounting may not get cleaned up
properly, resulting in an error:

  BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

Here is an analysis of the race as seen on a MIPS device. On this
particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
initialized, at which point it becomes PFN 5120:

  1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
       kobject_uevent_env+0x7e4/0x7ec
       kset_register+0x68/0x88
       bus_register+0xdc/0x34c
       subsys_virtual_register+0x34/0x78
       wq_sysfs_init+0x1c/0x4c
       do_one_initcall+0x50/0x1a8
       kernel_init_freeable+0x230/0x2c8
       kernel_init+0x10/0x100
       ret_from_kernel_thread+0x14/0x1c

  2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
     kernel_execve asynchronously.

  3. Memory allocations in kernel_execve cause a page fault, bumping the
     MM_ANONPAGES RSS counter:
       add_mm_counter_fast+0xb4/0xc0
       handle_mm_fault+0x6e4/0xea0
       __get_user_pages.part.78+0x190/0x37c
       __get_user_pages_remote+0x128/0x360
       get_arg_page+0x34/0xa0
       copy_string_kernel+0x194/0x2a4
       kernel_execve+0x11c/0x298
       call_usermodehelper_exec_async+0x114/0x194

  4. If zero_pfn has not been initialized yet, zap_pte_range does not
     decrement the MM_ANONPAGES RSS counter and the BUG message is
     triggered shortly afterwards when __mmdrop checks the RSS counters:
       __mmdrop+0x98/0x1d0
       free_bprm+0x44/0x118
       kernel_execve+0x160/0x1d8
       call_usermodehelper_exec_async+0x114/0x194
       ret_from_kernel_thread+0x14/0x1c

To avoid races such as the one described above, run init_zero_pfn() at
early_initcall level.  Depending on the architecture, ZERO_PAGE is
either constant or gets initialized even earlier, at paging_init, so
there is no issue with initializing zero_pfn earlier.
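
The ordering argument can be illustrated with a small userspace model.
This is only a sketch, not kernel code: the names boot_model,
register_initcall, run_initcalls and boot_hits_race are hypothetical,
and the model makes two simplifying assumptions. It runs the modelled
wq_sysfs_init() synchronously, whereas the real usermode helper runs
asynchronously, and it assumes the workqueue initcall is registered
ahead of init_zero_pfn() within the core level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of two initcall levels; not kernel code. */
enum level { LEVEL_EARLY, LEVEL_CORE, LEVEL_COUNT };

typedef void (*initcall_fn)(void);

#define MAX_CALLS 4

struct boot_model {
    initcall_fn calls[LEVEL_COUNT][MAX_CALLS];
    size_t count[LEVEL_COUNT];
};

static unsigned long zero_pfn;        /* 0 until init_zero_pfn_model() runs */
static int helper_saw_unset_zero_pfn; /* set if the helper runs too early */

/* Stands in for init_zero_pfn(); 5120 is the PFN quoted in the report. */
static void init_zero_pfn_model(void)
{
    zero_pfn = 5120;
}

/* Stands in for wq_sysfs_init(), whose uevent ends up in kernel_execve. */
static void wq_sysfs_init_model(void)
{
    if (zero_pfn == 0)
        helper_saw_unset_zero_pfn = 1;
}

static void register_initcall(struct boot_model *m, enum level lvl,
                              initcall_fn fn)
{
    assert(m->count[lvl] < MAX_CALLS);
    m->calls[lvl][m->count[lvl]++] = fn;
}

/* Run every level in order; within a level, use registration order. */
static void run_initcalls(const struct boot_model *m)
{
    for (int lvl = 0; lvl < LEVEL_COUNT; lvl++)
        for (size_t i = 0; i < m->count[lvl]; i++)
            m->calls[lvl][i]();
}

/* Returns 1 if the modelled helper ran before zero_pfn was set, else 0. */
static int boot_hits_race(enum level zero_pfn_level)
{
    struct boot_model m = {0};

    zero_pfn = 0;
    helper_saw_unset_zero_pfn = 0;

    /* Assumption: the workqueue initcall comes first at core level. */
    register_initcall(&m, LEVEL_CORE, wq_sysfs_init_model);
    register_initcall(&m, zero_pfn_level, init_zero_pfn_model);
    run_initcalls(&m);

    return helper_saw_unset_zero_pfn;
}
```

In this model, boot_hits_race(LEVEL_CORE) corresponds to the old
registration and returns 1, while boot_hits_race(LEVEL_EARLY)
corresponds to the patched registration and returns 0, because every
early-level call completes before any core-level call starts.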

Link: https://lkml.kernel.org/r/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niAXOpqEbt7w@mail.gmail.com
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: stable@vger.kernel.org
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07 12:05:40 +02:00
kasan kasan: avoid -Wmaybe-uninitialized warning 2019-05-08 07:19:07 +02:00
backing-dev.c memcg: fix a crash in wb_workfn when a device disappears 2021-02-23 13:59:15 +01:00
balloon_compaction.c
bootmem.c
cleancache.c
cma_debug.c mm/cma_debug.c: fix the break condition in cma_maxchunk_get() 2019-06-22 08:17:12 +02:00
cma.c mm/cma.c: fail if fixed declaration can't be honored 2019-08-06 18:29:37 +02:00
cma.h
compaction.c
debug_page_ref.c
debug.c mm: get rid of vmacache_flush_all() entirely 2018-09-19 22:47:17 +02:00
dmapool.c
early_ioremap.c mm/early_ioremap: Fix boot hang with earlyprintk=efi,keep 2018-02-25 11:05:49 +01:00
fadvise.c mm/fadvise.c: fix signed overflow UBSAN complaint 2018-09-15 09:42:57 +02:00
failslab.c
filemap.c mm/filemap.c: clear page error before actual read 2020-10-01 20:40:11 +02:00
frame_vector.c mm/frame_vector.c: release a semaphore in 'get_vaddr_frames()' 2018-11-10 07:42:52 -08:00
frontswap.c
gup.c mm: prevent get_user_pages() from overflowing page refcount 2019-06-11 12:22:45 +02:00
highmem.c
huge_memory.c mm, thp: make do_huge_pmd_wp_page() lock page for testing mapcount 2021-03-03 17:44:32 +01:00
hugetlb_cgroup.c mm: hugetlb: switch to css_tryget() in hugetlb_cgroup_charge_cgroup() 2019-11-25 09:51:58 +01:00
hugetlb.c mm/hugetlb.c: fix unnecessary address expansion of pmd sharing 2021-03-07 11:25:57 +01:00
hwpoison-inject.c
init-mm.c
internal.h vmscan: return NODE_RECLAIM_NOSCAN in node_reclaim() when CONFIG_NUMA is n 2019-12-05 15:35:02 +01:00
interval_tree.c
Kconfig mm: don't allow deferred pages with NEED_PER_CPU_KM 2018-05-22 16:57:57 +02:00
Kconfig.debug
khugepaged.c mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged 2020-10-14 09:48:17 +02:00
kmemcheck.c
kmemleak-test.c
kmemleak.c mm/kmemleak.c: fix check for softirq context 2019-08-04 09:33:41 +02:00
ksm.c mm/ksm.c: don't WARN if page is still mapped in remove_stable_node() 2019-11-28 18:28:09 +01:00
list_lru.c mm/list_lru.c: fix memory leak in __memcg_init_list_lru_node 2019-06-22 08:17:18 +02:00
maccess.c uaccess: Add non-pagefault user-space write function 2020-09-12 11:47:35 +02:00
madvise.c mm: madvise(MADV_DODUMP): allow hugetlbfs pages 2018-10-10 08:53:20 +02:00
Makefile
memblock.c memblock: do not start bottom-up allocations with kernel_end 2021-02-23 13:59:16 +01:00
memcontrol.c mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback() 2021-02-23 13:59:14 +01:00
memory_hotplug.c mm: Avoid calling build_all_zonelists_init under hotplug context 2020-08-21 11:02:11 +02:00
memory-failure.c mm: hwpoison: fix thp split handing in soft_offline_in_use_page() 2019-03-23 13:19:49 +01:00
memory.c mm: fix race by making init_zero_pfn() early_initcall 2021-04-07 12:05:40 +02:00
mempolicy.c mm: mempolicy: fix potential pte_unmap_unlock pte error 2020-11-18 18:26:23 +01:00
mempool.c
memtest.c
migrate.c hugetlbfs: fix races and page leaks during migration 2019-03-13 14:04:54 -07:00
mincore.c mm/mincore.c: make mincore() more conservative 2019-05-21 18:48:58 +02:00
mlock.c mm/mlock.c: change count_mm_mlocked_page_nr return type 2019-07-10 09:55:43 +02:00
mm_init.c
mmap.c mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area 2020-10-01 20:40:11 +02:00
mmu_context.c
mmu_notifier.c mm/mmu_notifier: use hlist_add_head_rcu() 2019-08-04 09:33:42 +02:00
mmzone.c
mprotect.c x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings 2018-08-15 18:14:45 +02:00
mremap.c mm: Fix mremap not considering huge pmd devmap 2020-06-11 09:22:20 +02:00
msync.c
nobootmem.c mm: discard memblock data later 2017-08-24 17:12:19 -07:00
nommu.c x86/mm: split vmalloc_sync_all() 2020-04-02 17:20:26 +02:00
oom_kill.c oom, oom_reaper: do not enqueue same task twice 2019-02-12 19:45:02 +01:00
page_alloc.c mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged 2020-10-14 09:48:17 +02:00
page_counter.c
page_ext.c mm/page_ext.c: fix an imbalance with kmemleak 2019-04-05 22:29:06 +02:00
page_idle.c mm/page_idle.c: fix oops because end_pfn is larger than max_pfn 2019-07-10 09:55:38 +02:00
page_io.c swap: fix swapfile read/write offset 2021-03-07 11:25:59 +01:00
page_isolation.c
page_owner.c
page_poison.c
page-writeback.c mm: memcontrol: fix NULL pointer crash in test_clear_page_writeback() 2021-02-23 13:59:14 +01:00
pagewalk.c mm: pagewalk: fix termination condition in walk_pte_range() 2020-10-01 20:40:06 +02:00
percpu-km.c
percpu-vm.c
percpu.c percpu: stop printing kernel addresses 2019-04-27 09:34:47 +02:00
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm: migration: fix migration of huge PMD shared pages 2018-11-21 09:26:03 +01:00
shmem.c shmem: fix possible deadlocks on shmlock_user_lock 2020-05-20 08:15:33 +02:00
slab_common.c mm/slab: use memzero_explicit() in kzfree() 2020-06-30 15:38:44 -04:00
slab.c mm/slab.c: fix an infinite loop in leaks_show() 2019-06-22 08:17:13 +02:00
slab.h
slob.c
slub.c Revert "mm, slub: consider rest of partial list if acquire_slab() fails" 2021-03-17 16:10:14 +01:00
sparse-vmemmap.c
sparse.c mm/memory_hotplug: set magic number to page->freelist instead of page->lru.next 2017-10-21 17:21:36 +02:00
swap_cgroup.c mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() 2017-07-05 14:40:17 +02:00
swap_state.c mm: fix swap cache node allocation mask 2020-07-09 09:35:54 +02:00
swap.c
swapfile.c swap: fix swapfile read/write offset 2021-03-07 11:25:59 +01:00
truncate.c mm: cleancache: fix corruption on missed inode invalidation 2018-12-08 13:05:09 +01:00
usercopy.c usercopy: Avoid HIGHMEM pfn warning 2019-10-17 13:42:06 -07:00
userfaultfd.c
util.c mm: page_mapped: don't assume compound page is huge or THP 2019-01-16 22:12:32 +01:00
vmacache.c mm: get rid of vmacache_flush_all() entirely 2018-09-19 22:47:17 +02:00
vmalloc.c mm/vmalloc.c: don't dereference possible NULL pointer in __vunmap() 2020-06-03 08:16:47 +02:00
vmpressure.c
vmscan.c mm: remove seemingly spurious reclaimability check from laptop_mode gating 2018-09-19 22:47:12 +02:00
vmstat.c mm, vmstat: hide /proc/pagetypeinfo from normal users 2019-11-12 19:15:43 +01:00
workingset.c
z3fold.c
zbud.c
zpool.c
zsmalloc.c zsmalloc: account the number of compacted pages correctly 2021-03-07 11:25:59 +01:00
zswap.c zswap: re-check zswap_is_full() after do zswap_shrink() 2018-09-05 09:20:02 +02:00