
[Wifi] Learning-Based Spatial Reuse for WLANs With Early Identification of Interfering Transmitters

by 메릴린 2023. 3. 7.


Preliminaries: Early Identification of Interfering Transmitters

State

Using MDP

  • a four-tuple $(\Omega, A, q, R)$
  • the state space is built from the sub-spaces below via union and Cartesian product (a state-tuple sketch follows this list)
    • $\Omega_{\mathrm{MAC}} := \{S_0, S_1, S_2, S_3\}$
    • $\Omega_{\mathrm{BS}}$
      • the current backoff stage
      • i.e., the number of consecutive transmission failures so far
    • $\Omega_{\mathrm{CH}} := \{0, 1, 2, \ldots, N\}$
      • the index of the transmitting interferer identified by the agent
      • $\omega_{\mathrm{CH}}[t] = 0$ : “the channel is idle” or “the interferer cannot be identified”
    • $\Omega_{\mathrm{DR}} := \{1, \ldots, K\}$
      • $K$ is the number of available MCSs
      • the currently chosen data rate for transmission
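
A minimal sketch of how the composite state could be represented, assuming the state is simply the tuple $(\omega_{\mathrm{MAC}}, \omega_{\mathrm{BS}}, \omega_{\mathrm{CH}}, \omega_{\mathrm{DR}})$. The field names are mine, and the comments on $S_0 \ldots S_3$ reflect my reading of the transitions described in this post, not the paper's exact definitions.

```python
from dataclasses import dataclass
from enum import IntEnum

class MacState(IntEnum):
    """MAC-layer states S0..S3 forming Omega_MAC (interpretation, not the paper's wording)."""
    S0 = 0  # ready to start a new transmission attempt (data rate is selected here)
    S1 = 1  # carrier sensing / backoff countdown in progress
    S2 = 2  # a transmission has been detected during backoff (ignore-or-defer decision)
    S3 = 3  # the transmission completed successfully

@dataclass(frozen=True)
class AgentState:
    """Composite state (omega_MAC, omega_BS, omega_CH, omega_DR)."""
    mac: MacState          # element of Omega_MAC
    backoff_stage: int     # Omega_BS: number of consecutive transmission failures so far
    interferer_id: int     # Omega_CH: 0 = channel idle / interferer not identified, 1..N otherwise
    data_rate_idx: int     # Omega_DR: 1..K, index of the currently chosen MCS

# Example: backoff stage 2, interferer #3 identified, MCS index 5 in use.
state = AgentState(MacState.S1, backoff_stage=2, interferer_id=3, data_rate_idx=5)
```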

Action

  1. Select the data rate (in $S_0$)
  2. Choose whether or not to ignore the detected transmission / adjust the data rate (in $S_2$)
  3. Continue carrier sensing (in $S_1$, while the backoff counter has not yet reached 0); a per-state action-set sketch is given below
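
A small sketch of the per-MAC-state action sets implied by the list above; the tuple encoding, the value of $K$, and the action labels are illustrative assumptions, not the paper's notation.

```python
# Per-MAC-state action sets; "defer" corresponds to a = 0 in S2.
K = 8  # illustrative number of available MCSs, not from the paper

ACTIONS = {
    "S0": [("select_rate", k) for k in range(1, K + 1)],          # pick one of the K data rates
    "S1": [("continue_sensing", None)],                            # keep counting down the backoff counter
    "S2": [("defer", 0)]                                           # a = 0: freeze and wait
        + [("ignore_and_transmit", k) for k in range(1, K + 1)],   # ignore detection, transmit at rate k
}

print(len(ACTIONS["S2"]))  # -> 9 (defer + K rate choices)
```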

Metric and Reward

Given that the agent has successfully transmitted a packet after $J$ consecutive packet transmission failures, the MAC service time $D$ is composed of:

  • $C_j$ : the duration of the unsuccessful transmission in backoff stage $j$
  • $T_J$ : the duration of the successful transmission
  • $B_j$ : the backoff countdown duration in backoff stage $j$
  • $Y$ : the number of times that the agent has frozen its backoff counter
  • $F_i$ : the duration for which the agent freezes its backoff counter (the $i$-th freeze)

⇒ i.e., the time from when a new packet is generated until it is successfully delivered (until ACK receipt); a possible reconstruction of $D$ is given below
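
One possible reconstruction of the service-time decomposition from the components listed above, assuming they simply add up over the $J$ failed attempts, the final successful attempt, and the $Y$ freeze periods (the exact indexing in the paper may differ):

$$
D \;=\; \sum_{j=0}^{J-1}\left(B_j + C_j\right) \;+\; B_J + T_J \;+\; \sum_{i=1}^{Y} F_i
$$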

reward (see the sketch after this list)

  • when the transmission fails (transition from $S_1$ to $S_0$)
    • $-(B_j + C_j)$
  • when the transmission succeeds (transition from $S_1$ to $S_3$)
    • $-(B_J + T_J)$
  • when the agent has frozen its backoff counter to wait until the detected transmission ends
    (when $a = 0$, transition from $S_2$ to $S_1$)
    • $-F_i$
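
A minimal sketch of the reward assignment implied by the list above, assuming the reward is simply the negative of the time spent in the corresponding transition; the function and parameter names are mine, not the paper's.

```python
def reward(transition: str, backoff: float, tx_duration: float = 0.0,
           freeze: float = 0.0) -> float:
    """Return the (negative) reward for one MDP transition.

    transition   -- "fail" (S1 -> S0), "success" (S1 -> S3), or "freeze" (S2 -> S1)
    backoff      -- backoff countdown duration B_j spent in this stage
    tx_duration  -- C_j (failed) or T_J (successful) transmission duration
    freeze       -- F_i, duration of the backoff-counter freeze
    """
    if transition == "fail":      # unsuccessful transmission in backoff stage j
        return -(backoff + tx_duration)
    if transition == "success":   # successful transmission after the final backoff
        return -(backoff + tx_duration)
    if transition == "freeze":    # agent deferred (a = 0) and froze its counter
        return -freeze
    raise ValueError(f"unknown transition: {transition}")
```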

Learning-Based Spatial Reuse Operation

Learning Algorithm

  • RUQL (Repeated Update Q-Learning)
    • adjusts the learning rate
    • assigns a higher learning rate to actions that are explored less often (see the update sketch after this list)
    • $\alpha_n$ : the learning rate in the conventional QL algorithm
  • $\epsilon$-greedy exploration policy
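
A minimal sketch of an RUQL-style tabular update, assuming the usual formulation in which the effective learning rate of the chosen action is boosted to $1 - (1 - \alpha_n)^{1/\pi(a\mid s)}$ so that rarely selected actions learn faster; the paper's exact schedule for $\alpha_n$ and its $\epsilon$-greedy details may differ.

```python
import numpy as np

def ruql_update(Q, s, a, r, s_next, pi_sa, alpha, gamma=0.95):
    """One RUQL-style update of a tabular Q-function.

    pi_sa -- probability that the behavior policy (e.g. epsilon-greedy)
             selects action a in state s; smaller pi_sa => larger step.
    alpha -- base learning rate alpha_n of conventional Q-learning.
    """
    # Effective learning rate: equivalent to repeating the standard
    # Q-learning update 1/pi(a|s) times with learning rate alpha.
    alpha_eff = 1.0 - (1.0 - alpha) ** (1.0 / pi_sa)
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha_eff * (td_target - Q[s, a])
    return Q

def epsilon_greedy(Q, s, epsilon, rng):
    """Pick an action and return it with its selection probability pi(a|s)."""
    n_actions = Q.shape[1]
    greedy = int(np.argmax(Q[s]))
    a = int(rng.integers(n_actions)) if rng.random() < epsilon else greedy
    pi_sa = epsilon / n_actions + (1.0 - epsilon) * (a == greedy)
    return a, pi_sa
```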

Transmit Power Restriction

To protect on-going transmissions during concurrent (spatial-reuse) transmission, the agent lowers its transmit power.

  • $P_{\mathrm{ref}}$ : maximum possible transmit power of the agent
  • $\Theta_{\min} = -82\,\mathrm{dBm}$
    • the default CCA threshold of legacy devices
  • $I$ : measured interference strength

⇒ the transmit power is set inversely proportional to the detected interference strength (a sketch of one possible rule follows)
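
A hedged sketch of what such a restriction could look like in dB terms, consistent with the "inversely proportional" statement above: the more the measured interference $I$ exceeds $\Theta_{\min}$, the more the transmit power is backed off from $P_{\mathrm{ref}}$. This mirrors the OBSS-PD style rule of 802.11ax spatial reuse, but the paper's exact formula may differ.

```python
def restricted_tx_power(p_ref_dbm: float, interference_dbm: float,
                        theta_min_dbm: float = -82.0) -> float:
    """Transmit-power restriction sketch (all quantities in dBm/dB).

    The back-off grows with the amount by which the measured interference
    strength exceeds the legacy CCA threshold theta_min, so the allowed
    power is inversely related to the detected interference strength.
    """
    backoff_db = max(0.0, interference_dbm - theta_min_dbm)
    return p_ref_dbm - backoff_db

# Example: interference measured at -70 dBm => transmit 12 dB below the reference power.
print(restricted_tx_power(p_ref_dbm=20.0, interference_dbm=-70.0))  # -> 8.0
```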

Numerical Evaluation

  • Throughput
  • MAC Service Time Composition
  • Performance Gains Due to Identifying Interferers
  • Time-Varying Topology
    • the transmitter locations change once per second
  • Impact to Legacy Transmitters
    • evaluate the percentage of packets transmitted by the OBSS transmitters that are corrupted by the transmission of the agent.
  • Multiple Agents

Analysis of Gains Due to Identifying Interferers

  • State Partition : Stationary MDP
  • Analysis of Gains Due to Identifying Interferers

Points

  • The state expresses the situation the agent faces in the current topology
  • The objective is to reduce the agent's MAC service time
  • Experiments also cover a multi-agent environment with 10 agents
    • each agent acts selfishly
  • Use of the partitioned MDP
    • but it is simplified by not distinguishing interferer identities
    • it is not used in the learning algorithm or in the simulation evaluation

Questions

🧐 Why are all the reward values set to be negative? This seems odd, since it implies that every action the agent takes works against its own goal.

🧐 Why is a proportional rule used for adjusting the transmit power? Is it meant in the sense of proportional fairness?

🧐 Is the throughput difference in Fig. 8 really meaningful? (4 transmitters, a difference of about 10 Mbit/s)
