Discussion:
iwl3945: Don't queue rfkill_poll work when module is exiting
(too old to reply)
TJ
2009-03-24 15:44:19 UTC
Permalink
Tim Gardner suggested I forward this as a possible stable-release
update. I found the problem in our current Ubuntu Jaunty tree - iwlwifi
version 1.2.26k. What follows is a copy of the patch for Jaunty
(2.6.28).

----------
Bug: #345710

When the wireless interface is active and the iwl3945 module is unloaded the
call to ieee80211_unregister_hw() would call iwl3945_mac_stop() which would
restart the delayed workqueue for rfkill_poll. That workqueue had already been
cancelled so when the next work item was run (2 seconds later) the system would
suffer a hard lock-up because the module had been unloaded by then.

This patch implements STATUS_EXIT_PENDING checks in places where the rfkill_poll
work is scheduled, and moves the final workqueue cancellation to occur after the
call to ieee80211_unregister_hw().

Bug discovered, experienced and fix tested on my PC.

Signed-off-by: TJ <ubuntu-A+***@public.gmane.org>
---
drivers/net/wireless/iwlwifi/iwl3945-base.c | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
index bb92db2..acaf038 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -6062,7 +6062,9 @@ static void iwl3945_rfkill_poll(struct work_struct *data)
if (test_bit(STATUS_RF_KILL_HW, &status) != test_bit(STATUS_RF_KILL_HW, &priv->status))
queue_work(priv->workqueue, &priv->rf_kill);

- queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+ /* only queue if module isn't exiting */
+ if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+ queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
round_jiffies_relative(2 * HZ));

}
@@ -6588,7 +6590,10 @@ static void iwl3945_mac_stop(struct ieee80211_hw *hw)
flush_workqueue(priv->workqueue);

/* start polling the killswitch state again */
- queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+
+ /* only queue if module isn't exiting */
+ if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+ queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
round_jiffies_relative(2 * HZ));

IWL_DEBUG_MAC80211("leave\n");
@@ -8166,7 +8171,6 @@ static void __devexit iwl3945_pci_remove(struct pci_dev *pdev)
sysfs_remove_group(&pdev->dev.kobj, &iwl3945_attribute_group);

iwl3945_rfkill_unregister(priv);
- cancel_delayed_work(&priv->rfkill_poll);
iwl3945_dealloc_ucode_pci(priv);

if (priv->rxq.bd)
@@ -8182,6 +8186,12 @@ static void __devexit iwl3945_pci_remove(struct pci_dev *pdev)
/*netif_stop_queue(dev); */
flush_workqueue(priv->workqueue);

+ /* ieee80211_unregister_hw calls iwl3945_mac_stop which used to restart the rfkill
+ polling. Although that now checks STATUS_EXIT_PENDING do cancel and wait for any
+ pending work to complete */
+ cancel_delayed_work(&priv->rfkill_poll);
+ cancel_work_sync(&priv->rfkill_poll);
+
/* ieee80211_unregister_hw calls iwl3945_mac_stop, which flushes
* priv->workqueue... so we can't take down the workqueue
* until now... */
--
1.6.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Helmut Schaa
2009-03-24 17:51:11 UTC
Permalink
Post by TJ
Tim Gardner suggested I forward this as a possible stable-release
update. I found the problem in our current Ubuntu Jaunty tree - iwlwi=
fi
Post by TJ
version 1.2.26k. What follows is a copy of the patch for Jaunty
(2.6.28).
The patch that introduced the rfkill polling did not hit mainline yet. =
It
should pop up in 2.6.30, hence this patch is not needed for any stable
series.

I wasn't able to reproduce the issue by unloading iwl3945 here. Any hin=
ts
on how to reproduce this issue?

Thanks,
Helmut
Post by TJ
----------
Bug: #345710
=20
When the wireless interface is active and the iwl3945 module is unloa=
ded the
Post by TJ
call to ieee80211_unregister_hw() would call iwl3945_mac_stop() which=
would
Post by TJ
restart the delayed workqueue for rfkill_poll. That workqueue had alr=
eady been
Post by TJ
cancelled so when the next work item was run (2 seconds later) the sy=
stem would
Post by TJ
suffer a hard lock-up because the module had been unloaded by then.
=20
This patch implements STATUS_EXIT_PENDING checks in places where the =
rfkill_poll
Post by TJ
work is scheduled, and moves the final workqueue cancellation to occu=
r after the
Post by TJ
call to ieee80211_unregister_hw().
=20
Bug discovered, experienced and fix tested on my PC.
=20
---
drivers/net/wireless/iwlwifi/iwl3945-base.c | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)
=20
diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/ne=
t/wireless/iwlwifi/iwl3945-base.c
Post by TJ
index bb92db2..acaf038 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -6062,7 +6062,9 @@ static void iwl3945_rfkill_poll(struct work_str=
uct *data)
Post by TJ
if (test_bit(STATUS_RF_KILL_HW, &status) !=3D test_bit(STATUS_RF_KI=
LL_HW, &priv->status))
Post by TJ
queue_work(priv->workqueue, &priv->rf_kill);
=20
- queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+ /* only queue if module isn't exiting */
+ if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+ queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
round_jiffies_relative(2 * HZ));
=20
}
@@ -6588,7 +6590,10 @@ static void iwl3945_mac_stop(struct ieee80211_=
hw *hw)
Post by TJ
flush_workqueue(priv->workqueue);
=20
/* start polling the killswitch state again */
- queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
+
+ /* only queue if module isn't exiting */
+ if (! test_bit(STATUS_EXIT_PENDING, &priv->status))
+ queue_delayed_work(priv->workqueue, &priv->rfkill_poll,
round_jiffies_relative(2 * HZ));
=20
IWL_DEBUG_MAC80211("leave\n");
@@ -8166,7 +8171,6 @@ static void __devexit iwl3945_pci_remove(struct=
pci_dev *pdev)
Post by TJ
sysfs_remove_group(&pdev->dev.kobj, &iwl3945_attribute_group);
=20
iwl3945_rfkill_unregister(priv);
- cancel_delayed_work(&priv->rfkill_poll);
iwl3945_dealloc_ucode_pci(priv);
=20
if (priv->rxq.bd)
@@ -8182,6 +8186,12 @@ static void __devexit iwl3945_pci_remove(struc=
t pci_dev *pdev)
Post by TJ
/*netif_stop_queue(dev); */
flush_workqueue(priv->workqueue);
=20
+ /* ieee80211_unregister_hw calls iwl3945_mac_stop which used to res=
tart the rfkill
Post by TJ
+ polling. Although that now checks STATUS_EXIT_PENDING do cancel =
and wait for any
Post by TJ
+ pending work to complete */
+ cancel_delayed_work(&priv->rfkill_poll);
+ cancel_work_sync(&priv->rfkill_poll);
+
/* ieee80211_unregister_hw calls iwl3945_mac_stop, which flushes
* priv->workqueue... so we can't take down the workqueue
* until now... */
--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
TJ
2009-03-24 21:22:19 UTC
Permalink
Post by TJ
Tim Gardner suggested I forward this as a possible stable-release
update. I found the problem in our current Ubuntu Jaunty tree - iwl=
wifi
Post by TJ
version 1.2.26k. What follows is a copy of the patch for Jaunty
(2.6.28).
=20
The patch that introduced the rfkill polling did not hit mainline yet=
=2E It
should pop up in 2.6.30, hence this patch is not needed for any stabl=
e
series.
=20
I wasn't able to reproduce the issue by unloading iwl3945 here. Any h=
ints
on how to reproduce this issue?
lspci -nn -s 06:00.0
06:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 3945A=
BG [Golan] Network Connection [8086:4222] (rev 02)

With the interface active and in use do:

sudo modprobe -r iwl3945

Approximately 2 seconds later the PC locks solid and requires a
power-off button restart.

Confirmed by another of our kernel team:

https://launchpad.net/bugs/345710

--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Huaxu Wan
2009-03-25 03:41:21 UTC
Permalink
Hi,

This issue is confirmed here too. But during compile, there is warning =
about
cancel_work_sync(), cancel_delayed_work_sync() should be used instead.

And, without the modification in iwl3945_rfkill_poll() and=20
iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this=
issue
is also fixed in my testing.

Best regards
Huaxu

PS: The following is the log when iwl3945 was unloaded before patched.

***@Januty-devel:~# modprobe -r iwl3945
[ 164.840610] iwl3945: U iwl3945_pci_remove *** UNLOAD DRIVER ***
[ 164.846515] iwl3945: U __iwl3945_down iwl3945 is going down
[ 164.852315] iwl3945: U iwl3945_hw_nic_stop_master stop master
[ 164.858634] iwl3945: U iwl3945_clear_free_frames 0 frames on pre-all=
ocated heap on clear.
[ 164.917177] iwl3945: U __iwl3945_down iwl3945 is going down
[ 164.922735] iwl3945: U iwl3945_clear_free_frames 0 frames on pre-all=
ocated heap on clear.
[ 166.804090] BUG: unable to handle kernel paging request at 000077ff8=
b64b8c7
[ 166.808082] IP: [<ffffffff80262b62>] delayed_work_timer_fn+0x32/0x40
[ 166.808082] PGD 0=20
[ 166.808082] Oops: 0000 [#1] SMP=20
[ 166.808082] last sysfs file: /sys/module/cfg80211/refcnt
[ 166.808082] Dumping ftrace buffer:
[ 166.808082] (ftrace buffer empty)
[ 166.808082] CPU 1=20
[ 166.808082] Modules linked in: lp parport arc4 ecb psmouse iTCO_wdt =
video serio_raw pcspkr usbhid intel_agp iTCO_vendor_support output ehci=
_hcd uhci_hcd ]
[ 166.808082] Pid: 0, comm: swapper Not tainted 2.6.28.8 #1
[ 166.808082] RIP: 0010:[<ffffffff80262b62>] [<ffffffff80262b62>] del=
ayed_work_timer_fn+0x32/0x40
[ 166.808082] RSP: 0018:ffff88007b62be80 EFLAGS: 00010246
[ 166.808082] RAX: 000077ff8b64b8bf RBX: ffff8800738df7c8 RCX: ffff880=
0738df7c8
[ 166.808082] RDX: 0000000000000001 RSI: ffff8800738df7c8 RDI: ffff880=
0738df7c8
[ 166.808082] RBP: ffff88007b62be80 R08: ffff8800738df820 R09: 0000000=
000000000
[ 166.808082] R10: ffff88007b625e18 R11: 0000000000000000 R12: 0000000=
000000100
[ 166.808082] R13: ffff88007b614000 R14: ffff88007b62beb0 R15: fffffff=
f80262b30
[ 166.808082] FS: 0000000000000000(0000) GS:ffff88007bc02c00(0000) kn=
lGS:0000000000000000
[ 166.808082] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 166.808082] CR2: 000077ff8b64b8c7 CR3: 0000000000201000 CR4: 0000000=
0000006a0
[ 166.808082] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000=
000000000
[ 166.808082] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000=
000000400
[ 166.808082] Process swapper (pid: 0, threadinfo ffff88007b624000, ta=
sk ffff88007b61c320)
[ 166.808082] Stack:
[ 166.808082] ffff88007b62bef0 ffffffff80259e39 ffff88007b615c18 ffff=
88007b615818
[ 166.808082] ffff88007b615418 ffff88007b615018 ffff88007b62beb0 ffff=
88007b62beb0
[ 166.808082] ffffffff80271efd 0000000000000008 0000000000000001 0000=
000000000100
[ 166.808082] Call Trace:
[ 166.808082] <IRQ> <0> [<ffffffff80259e39>] run_timer_softirq+0x179/=
0x260
[ 166.808082] [<ffffffff80271efd>] ? tick_handle_oneshot_broadcast+0x=
ed/0x100
[ 166.808082] [<ffffffff80254cfc>] __do_softirq+0x9c/0x170
[ 166.808082] [<ffffffff80213cac>] call_softirq+0x1c/0x30
[ 166.808082] [<ffffffff80214f0d>] do_softirq+0x5d/0xa0
[ 166.808082] [<ffffffff80254a7d>] irq_exit+0x8d/0xa0
[ 166.808082] [<ffffffff802151d5>] do_IRQ+0xc5/0x110
[ 166.808082] [<ffffffff80212b13>] ret_from_intr+0x0/0x29
[ 166.808082] <EOI> <0> [<ffffffff8046d906>] ? acpi_idle_enter_simple=
+0x166/0x1a4
[ 166.808082] [<ffffffff8046d8fe>] ? acpi_idle_enter_simple+0x15e/0x1=
a4
[ 166.808082] [<ffffffff805745a5>] ? cpuidle_idle_call+0xa5/0x100
[ 166.808082] [<ffffffff80210e85>] ? cpu_idle+0x65/0xc0
[ 166.808082] [<ffffffff8067e6fd>] ? start_secondary+0x168/0x1bb
[ 166.808082] Code: 65 8b 14 25 24 00 00 00 48 89 e5 48 83 e0 fc 48 8b=
40 38 8b 70 20 48 8b 00 85 f6 0f 45 15 ff c9 72 00 48 f7 d0 48 89 ce 4=
8 63 d2 <48> 8=20
[ 166.808082] RIP [<ffffffff80262b62>] delayed_work_timer_fn+0x32/0x4=
0
[ 166.808082] RSP <ffff88007b62be80>
[ 166.808082] CR2: 000077ff8b64b8c7
[ 167.086916] Kernel panic - not syncing: Fatal exception in interrupt
Post by TJ
Post by TJ
Tim Gardner suggested I forward this as a possible stable-release
update. I found the problem in our current Ubuntu Jaunty tree - i=
wlwifi
Post by TJ
Post by TJ
version 1.2.26k. What follows is a copy of the patch for Jaunty
(2.6.28).
=20
The patch that introduced the rfkill polling did not hit mainline y=
et. It
Post by TJ
should pop up in 2.6.30, hence this patch is not needed for any sta=
ble
Post by TJ
series.
=20
I wasn't able to reproduce the issue by unloading iwl3945 here. Any=
hints
Post by TJ
on how to reproduce this issue?
=20
lspci -nn -s 06:00.0
06:00.0 Network controller [0280]: Intel Corporation PRO/Wireless 394=
5ABG [Golan] Network Connection [8086:4222] (rev 02)
Post by TJ
=20
=20
sudo modprobe -r iwl3945
=20
Approximately 2 seconds later the PC locks solid and requires a
power-off button restart.
=20
=20
https://launchpad.net/bugs/345710
=20
--
To unsubscribe from this list: send the line "unsubscribe linux-wirel=
ess" in
Post by TJ
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-wireles=
s" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
TJ
2009-03-25 04:31:16 UTC
Permalink
Hi,
This issue is confirmed here too. But during compile, there is warning about
cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
And, without the modification in iwl3945_rfkill_poll() and
iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
is also fixed in my testing.
Originally I did try adding just cancel_delayed_work_sync() although I
put it immediately following:

iwl3945_rfkill_unregister(priv);
cancel_delayed_work(&priv->rfkill_poll);

Wondered why that didn't work and hunted down the unnecessary calls to
queue_delayed_work() that were causing the issue and made them
conditional on STATUS_EXIT_PENDING.

Then I decided to be absolutely sure and moved cancel_delayed_work()
further down and for some reason mis-used cancel_work_sync() instead of
cancel_delayed_work_sync() - not paying attention I guess!



--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
reinette chatre
2009-03-26 19:39:03 UTC
Permalink
Post by TJ
Hi,
This issue is confirmed here too. But during compile, there is warning about
cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
And, without the modification in iwl3945_rfkill_poll() and
iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
is also fixed in my testing.
Originally I did try adding just cancel_delayed_work_sync() although I
iwl3945_rfkill_unregister(priv);
cancel_delayed_work(&priv->rfkill_poll);
Wondered why that didn't work and hunted down the unnecessary calls to
queue_delayed_work() that were causing the issue and made them
conditional on STATUS_EXIT_PENDING.
Then I decided to be absolutely sure and moved cancel_delayed_work()
further down and for some reason mis-used cancel_work_sync() instead of
cancel_delayed_work_sync() - not paying attention I guess!
Sounds like this can be done in one line ... could you please resubmit?

Thanks

Reinette

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
reinette chatre
2009-04-02 22:36:06 UTC
Permalink
Hi Huaxu,
Hi,
This issue is confirmed here too. But during compile, there is warning about
cancel_work_sync(), cancel_delayed_work_sync() should be used instead.
And, without the modification in iwl3945_rfkill_poll() and
iwl3945_mac_stop(), just with one line cancel_delayed_work_sync(), this issue
is also fixed in my testing.
TJ disappeared. We do need this fixed, would you like to submit your
patch instead?

Thanks

Reinette


--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Continue reading on narkive:
Loading...