

                                                              Jianfeng Guan 
                                                                 Neng Zhang 
                                                               Changqiao Xu 
                                                            Mingchuan Zhang 
    PAWS                                                       Hongke Zhang 
    Internet-Draft                                                     BUPT 
    Intended status: Informational                             Hongke Zhang 
    Expires: June 12, 2014                                December 12, 2013
     
                                          
                           PAWS Smart Database 
                         draft-guan-paws-smart-database-00 


    Status of this Memo 

       This Internet-Draft is submitted in full conformance with the 
       provisions of BCP 78 and BCP 79.  

       Internet-Drafts are working documents of the Internet Engineering 
       Task Force (IETF), its areas, and its working groups.  Note that 
       other groups may also distribute working documents as Internet-
       Drafts. 

       Internet-Drafts are draft documents valid for a maximum of six 
       months and may be updated, replaced, or obsoleted by other documents 
       at any time.  It is inappropriate to use Internet-Drafts as 
       reference material or to cite them other than as "work in progress." 

       The list of current Internet-Drafts can be accessed at 
       http://www.ietf.org/ietf/1id-abstracts.txt 

       The list of Internet-Draft Shadow Directories can be accessed at 
       http://www.ietf.org/shadow.html 

       This Internet-Draft will expire on June 12, 2014. 

    Copyright Notice 

       Copyright (c) 2013 IETF Trust and the persons identified as the 
       document authors. All rights reserved. 

       This document is subject to BCP 78 and the IETF Trust's Legal
       Provisions Relating to IETF Documents
       (http://trustee.ietf.org/license-info) in effect on the date of
       publication of this document. Please review these documents
       carefully, as they describe your rights and restrictions with respect
       to this document. Code Components extracted from this document must
       include Simplified BSD License text as described in Section 4.e of
       the Trust Legal Provisions and are provided without warranty as
       described in the Simplified BSD License.

     
     
     
    <Guan, et al.>          Expires June 12, 2014                 [Page 1] 
     
    Internet-Draft         PAWS Smart Database                December 2013 
        

    Abstract 

       This document describes a Smart Database operation mechanism for
       PAWS. With this mechanism, a master device obtains the white space
       spectrum best suited to its usage within the regulatory domain. The
       mechanism extends the Protocol to Access Spectrum Database with
       user behavior analysis and machine learning.

    Table of Contents 

        
       1. Introduction.................................................2
       2. Conventions used in this document............................3
       3. Procedure Overview...........................................4
          3.1. Problem Description.....................................4
          3.2. Multi-Dimensional Aggregation Policy....................5
          3.3. Data Preprocessing......................................6
       4. Specification................................................6
          4.1. Feature Abstraction.....................................6
          4.2. Dataset Training by Machine Learning Methods............8
             4.2.1. User Behavior Clustering...........................8
             4.2.2. Binary Prediction..................................8
             4.2.3. Spectrum Service Recommendation....................8
          4.3. Prediction Results......................................9
       5. Working flow.................................................9
          5.1. Spectrum prediction scenario............................9
          5.2. WSDB Commendation Procedure............................10
       6. Security Considerations.....................................10
       7. IANA Considerations.........................................10
       8. Conclusions.................................................11
       9. References..................................................11
          9.1. Normative References...................................11
       10. Acknowledgments............................................11
       Authors' Addresses.............................................12
        
    1. Introduction 

       Dynamic spectrum access technology has made white space allocation
       and utilization practical. A growing number of spectrum allocation
       algorithms and industrial solutions have moved from the laboratory
       to deployment, alongside the standards being developed by the IETF
       PAWS working group. In the PAWS protocol, the Database is
       responsible for allocating spectrum to the master device. However,
       spectrum usage behavior differs from user to user, while the
       Database can only hand every user the same spectra stored in the
       server. Such imbalanced usage is itself a form of waste. From
       another perspective, although white space can achieve spectrum
       usage diversity through dynamic random access, fairness and
       security considerations make some manual intervention and
       administrative control necessary to coordinate spectrum resources
       centrally. Likewise, the heavy load caused by competition for a
       single spectrum could be balanced across multiple white spaces.

       Some studies have sought to optimize spectrum allocation, but few
       address the diverse access motives and management issues described
       above. The European FP7 FARAMIR project focuses on spectrum
       measurement and performance characterization, in order to increase
       radio environment and spectral awareness in dynamic spectrum access
       scenarios. International communications companies are carrying out
       traffic management research and projects that use cloud computing
       to match spectrum utilization to user demand. However, relevant
       standards have not yet appeared, and so far every user accesses
       the same static spectra regardless of service. One format does not
       fit all.

       Based on these observations, we propose a Smart Database analysis
       and operation mechanism for PAWS. Unlike previous work, our
       approach characterizes spectrum usage behavior flexibly for
       different purposes. The smart Database is intended to enable user
       behavior recognition and demand-driven spectrum distribution. With
       this mechanism, the master device and slave device can rely on an
       optimized WSDB to communicate with better Quality of Experience
       (QoE) in the regulatory domain. Our protocol extends the existing
       PAWS protocol to support advanced network functions and improve
       spectrum usage efficiency.

    2. Conventions used in this document 

       The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 
       "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 
       document are to be interpreted as described in RFC-2119 [RFC2119].  

       The terminology from "Protocol to Access White Space (PAWS)
       Database: Use Cases and Requirements"
       [I-D.ietf-paws-problem-stmt-usecases-rqmts] is applicable to this
       document.

       White Space Database Analysis Server (WSDB AS):

       A smart WSDB with cognitive ability and new functions, such as
       selecting historical data as training samples and learning user
       behavior. The server acts as a smart analysis center that learns
       about, controls and coordinates white space use, to the benefit of
       both the WSDB and its clients. It performs three functions: user
       and service clustering, service prediction through data learning,
       and recommendation analysis through collaborative filtering. Its
       primary goal is to provide suitable white space spectrum in
       response to users' access requests. The server can be integrated
       with a normal WSDB or run as a standalone administrator with other
       auxiliary management functions, depending on the scope of the
       regulatory domain and the performance requirements.

       This draft is in scope for PAWS because it provides a group of
       formatted information for querying the Database in a smart way.
       Under the specified conditions, the device receives a list of
       available white space frequencies, each with a probability. The
       device can then select a spectrum and send an acknowledgment to
       the Database. Extending the Database to learn the user's
       conditions at query time makes the Database more cognitive.

    3. Procedure Overview 

    3.1. Problem Description 

       As mentioned above, a typical case under the current PAWS protocol
       is that the Database allocates the same spectrum resource to many
       users simultaneously. A user with light telephone traffic is then
       left with surplus bandwidth, while users delivering video may
       suffer severe QoS degradation due to interference or limited
       bandwidth. Our goal is to allocate spectrum according to spectrum
       attributes and user usage behavior.

       For instance, low frequency bands with long wavelengths are suited
       to coverage, while other bands are suited to capacity or to large
       video transmissions. Relying on user behavior analysis, a smart
       Database can recognize usage patterns, match them to bands, and
       select the appropriate spectrum for each user. Such automatic
       configuration is also in line with a vital concept in the trend
       toward software-defined networking and software-defined radio.

       To realize these functions, we employ machine learning methods to
       capture user behavior patterns, for two reasons. First, mobile
       communication services show varied and subtle characteristics that
       are hard to analyze with one simple intelligent algorithm. Second,
       the demands of the big data era make machine learning methods
       outperform, to some extent, other approaches such as reinforcement
       learning or correlation analysis.

       We divide the general procedure into sensing, deciding and
       recognition. We first describe label selection and then discuss
       the data mining methods, followed by the prediction results. After
       the analysis performed by the WSDB AS, we show the protocol
       interaction with users along with the newly added optimized
       parameters.

    3.2. Multi-Dimensional Aggregation Policy 

       For management purposes, the AS can be deployed on different
       platforms and send its results either to the Master Device
       uniformly or to the Slave Device directly. The AS can be located
       on a Master Device, at a telecommunications operator or at an
       Internet enterprise. To obtain multi-dimensional user data
       samples, a series of packet inspection or traffic monitoring tools
       can be combined with our smart analysis function to probe users'
       potential bandwidth and service demands in depth. In view of
       community benefit, we collect and aggregate data flows according
       to five common deployment policies:

       (1)  Data flows for users that share a master device. The Database
            can be deployed on a base station for real-time analysis and
            computing. This is the most basic method of managing spectrum
            occupancy and redistribution.

       (2)  Data flows for users that go to the same master device. For
            security reasons related to traffic volume, we can allocate
            a particular kind of white space, such as a trusted channel,
            to these users.

       (3)  Data flows that pass through a backbone network or a
            telecommunications operator. This is another common method,
            used to promote the commercial value of spectrum and to plan
            traffic and bandwidth.

       (4)  Data flows at an Internet enterprise such as YouTube or
            Facebook. Taking YouTube as an example, users that request
            the same video can be aggregated into one behavior pattern
            and allocated a relatively large bandwidth.

       (5)  Data flows for users of the same application, such as WeChat.
            Spectrum and traffic demand vary from one service to another,
            and a single application may contain a series of services
            such as video, audio and text. Spectrum features can be
            packaged into a bundle of functions.

       Although data can easily be collected under these policies, each
       company is limited to its own business area, so the potential
       value of the data cannot be fully released. Data mining can
       therefore be exploited more fully through cooperation among these
       entities.
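
       The five aggregation policies above amount to grouping flow
       records by a different key. The following Python sketch is
       illustrative only. The record fields (src_master, dst_master,
       carrier, site, app, bytes) and the sample values are hypothetical
       and are not defined by PAWS or by this document.

```python
from collections import defaultdict

# Hypothetical flow records; the field names are assumptions for illustration.
FLOWS = [
    {"user": "u1", "src_master": "m1", "dst_master": "m9", "carrier": "c1",
     "site": "video.example", "app": "video", "bytes": 5000},
    {"user": "u2", "src_master": "m1", "dst_master": "m9", "carrier": "c1",
     "site": "video.example", "app": "video", "bytes": 7000},
    {"user": "u3", "src_master": "m2", "dst_master": "m9", "carrier": "c2",
     "site": "chat.example", "app": "chat", "bytes": 300},
]

# One grouping key per aggregation policy (1)-(5).
POLICY_KEYS = {
    1: "src_master",  # users that share a master device
    2: "dst_master",  # flows that go to the same master device
    3: "carrier",     # flows crossing one backbone or operator
    4: "site",        # flows at one Internet enterprise
    5: "app",         # flows of one application
}


def aggregate(flows, policy):
    """Sum traffic volume per group under the chosen aggregation policy."""
    key = POLICY_KEYS[policy]
    totals = defaultdict(int)
    for flow in flows:
        totals[flow[key]] += flow["bytes"]
    return dict(totals)


print(aggregate(FLOWS, 1))
print(aggregate(FLOWS, 5))
```

       Keeping the policy as a lookup key lets the same raw records be
       re-aggregated under another policy without collecting them again.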

    3.3. Data Preprocessing 

       In the future era of ubiquitous networks, personal traffic may
       carry all kinds of information, including sound, image, video,
       fingerprint, product information, biological information or brain
       waves. This traffic will traverse countless user devices, which
       makes it more difficult to organize. In our model, we adopt
       machine learning algorithms to abstract user behavior features and
       predict spectrum usage. The goal of the preprocessing step is to
       normalize the messy data into a training dataset.

       First, we adopt general cloud technologies such as HDFS and
       MapReduce to store metadata in segments. The raw data is then
       aggregated into a dataset under one of the policies above. After
       cloud processing and data cleaning, the different types of data
       are normalized into structured data. When the data size is small,
       the few available samples can be used for training directly.
       Otherwise, random sampling is required to reduce the complexity of
       the big data.
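
       The normalization and sampling steps can be sketched as follows.
       This is a minimal single-machine illustration, assuming numeric
       rows that already passed data cleaning. Min-max scaling and the
       function names are our choices for the example; the document does
       not prescribe a normalization method.

```python
import random


def min_max_normalize(rows):
    """Scale every column of a list of numeric rows into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant column carries no information and maps to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def downsample(rows, limit, seed=0):
    """Return all rows when the set is small, else a random sample."""
    if len(rows) <= limit:
        return list(rows)
    return random.Random(seed).sample(rows, limit)


raw = [[0, 10], [5, 10], [10, 10]]
print(min_max_normalize(raw))
print(len(downsample(list(range(100)), 10)))
```

       The fixed seed only makes the sketch repeatable; a deployment
       would choose its own sampling strategy.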

    4. Specification 

       This section describes how the Database trains on the datasets and
       predicts a suitable white space to assign. The general procedure
       is to abstract features, train on the datasets and predict results
       for new data. Scaling and parallelizing machine learning
       algorithms over big data inside the cloud serves this procedure
       well.

    4.1. Feature Abstraction 

       User features and parameters are selected following a general
       unsupervised modeling process. Common feature selection and
       feature extraction methods, such as filter methods, Principal
       Component Analysis (PCA) and Singular Value Decomposition (SVD),
       can find significant feature subsets for training to some extent.
       Compared with traditional wireless resource distribution, the
       features in white space access are more complicated. Here we
       describe several typical features that represent user behavior.

       1 Geolocation: available spectrum is often sensed within a limited
       area, so the topographic information of the slave device affects
       white space quality and selection. Specific geographic
       information, such as latitude, longitude and antenna height, can
       be quantized into a value that serves as a feature for data
       learning.

       2 Time label: this feature is composed of two variables. On the
       one hand, the time scale affects the user behavior pattern. For
       example, at the beginning of a month an ample monthly mobile data
       plan gives users little reason to seek other resources, so there
       is less frequency hopping at the beginning of a month and,
       conversely, there are more white space requests at the end.
       Spectrum requirements also vary within a day. On the other hand,
       spectrum occupancy behavior is influenced by the usage time
       interval. Based on the timestamp, the usage interval can be
       quantized to the minute and the time scale can be quantized into
       hourly slots within a month.

       3 Service types: given the numerous applications such as streaming
       video, Voice over IP (VoIP), e-commerce and Enterprise Resource
       Planning (ERP), we intend to differentiate them so as to provide
       better QoE than best-effort service alone. Different applications
       have different demands for delay, jitter, bandwidth, packet loss
       and availability. Following the definitions in [RFC4594] and
       [RFC5127], and considering tolerance to packet loss, delay and
       jitter, we classify customer services into four types and ten
       classes with priority values. Service types can also be
       classified more flexibly by following the standards of operators
       and other entities.

       4 Roam state: this is also a two-dimensional feature, consisting
       of the current roaming state and the frequency with which a device
       hands off into and out of its resident area.
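
       The four features above can be combined into one numeric vector
       per request. The sketch below is one possible quantization, given
       here only as an assumption: the grid resolution, the integer
       service class value and all function names are hypothetical, and
       antenna height is left out for brevity.

```python
import math
from datetime import datetime

CELLS_PER_DEGREE = 100  # assumed grid resolution (hypothetical)


def geo_cell(lat, lon, cells=CELLS_PER_DEGREE):
    """Quantize latitude/longitude into a single integer grid-cell id."""
    row = math.floor((lat + 90.0) * cells)
    col = math.floor((lon + 180.0) * cells)
    return row * (360 * cells + 1) + col


def time_label(start, end):
    """Return (hourly slot within the month, usage duration in minutes)."""
    slot = (start.day - 1) * 24 + start.hour
    minutes = int((end - start).total_seconds() // 60)
    return slot, minutes


def feature_vector(lat, lon, start, end, service_class, roaming, handoffs):
    """Assemble geolocation, time label, service type and roam state."""
    slot, minutes = time_label(start, end)
    return [geo_cell(lat, lon), slot, minutes,
            service_class, int(roaming), handoffs]


start = datetime(2013, 12, 12, 9, 30)
end = datetime(2013, 12, 12, 10, 15)
print(feature_vector(39.96, 116.36, start, end, 3, True, 2))
```

       Vectors of this form are what the training step in Section 4.2
       would consume after normalization.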

       As more mobile applications and services emerge, we expect more
       features, such as biological data, to be introduced into training
       sets, which will redefine the feature abstraction criteria used
       with machine learning.

    4.2. Dataset Training by Machine Learning Methods 

       This step trains on the established datasets and validates the
       test results. The primary goal is to predict the most suitable
       white space for the user's behavior. In addition, other suitable
       services can be predicted and recommended to meet the user's
       potential requirements.

       For user behavior analytics in a traditional wireless network, or
       with a small number of users, common clustering methods meet the
       classification and prediction requirements. With the explosive
       growth of information and data volume, and depending on the
       application purpose, more scalable and parallel machine learning
       tools and methods aimed at big data become necessary. For the
       relevant big data and cloud analytics technologies, general
       industry standards can be consulted. User data can also be
       divided locally by neighborhood similarity so that machine
       learning methods can process the big data in parallel. Likewise,
       the Database can be distributed locally at some scale to carry
       out dataset training.

       The specific methods can be classified according to the following 
       three analytic models. 

    4.2.1. User Behavior Clustering 

       Clustering techniques aggregate items by likelihood and
       similarity. In our protocol, such methods can be used to group
       users with similar behavior. The same action, such as uniform
       spectrum distribution, is then applied to the whole cluster of
       users. This is a basic
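
       As an illustration of user behavior clustering, the sketch below
       groups feature vectors with plain k-means. The document does not
       name a specific clustering algorithm, so k-means is our choice
       for the example, and seeding the centroids with the first k
       points is a simplifying assumption.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Two light users and two heavy users (hypothetical normalized features).
users = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
labels, centroids = kmeans(users, k=2)
print(labels, centroids)
```

       Users that receive the same label would then be given the same
       spectrum distribution action.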

    4.2.2. Binary Prediction 

       We mainly use this learning model to make decisions and to predict
       a spectrum with a confidence or probability. Multi-ruleset data
       mining tools such as sparse Bayesian methods and kernel-based
       methods could be applied first to give better prediction results.
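
       To show what a prediction "with confidence or probability" looks
       like, the sketch below uses a categorical naive Bayes classifier
       with add-one smoothing. This is a deliberately simpler stand-in
       for the sparse Bayesian and kernel-based methods named above, not
       an implementation of them. The feature values and the yes/no
       "channel is suitable" label are hypothetical.

```python
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical naive Bayes for a True/False 'suitable' label."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (feature, label) -> counts
        self.values = defaultdict(set)            # feature -> seen values
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(i, label)][value] += 1
                self.values[i].add(value)
        return self

    def predict_proba(self, row):
        """Return P(label is True | row) using add-one smoothing."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, count in self.class_counts.items():
            score = count / total
            for i, value in enumerate(row):
                seen = self.value_counts[(i, label)][value]
                score *= (seen + 1) / (count + len(self.values[i]))
            scores[label] = score
        return scores.get(True, 0.0) / sum(scores.values())


rows = [("video", "evening"), ("video", "evening"),
        ("voice", "morning"), ("voice", "morning")]
labels = [True, True, False, False]
model = NaiveBayes().fit(rows, labels)
print(model.predict_proba(("video", "evening")))
```

       The returned probability is the value that the Database could
       attach to a candidate spectrum in its response.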

    4.2.3. Spectrum Service Recommendation 

       The goal of this model is to predict and recommend a service for
       multiple users with similar behavior. Information filtering
       technologies and similarity-based recommender systems can match
       users with the spectrum and services they are most likely to be
       interested in, using some kind of scoring mechanism. Multi-ruleset
       collaborative filtering could be implemented to compute these
       preferences and recommend spectrum or other user-oriented services
       such as data traffic plans. Moreover, such a correlation and
       filtering mechanism could monitor mass spectrum usage activity to
       prevent cooperative attacks by malicious users.
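
       A minimal sketch of similarity-based recommendation follows. It
       uses user-based collaborative filtering with cosine similarity,
       which is one common form of the technique and an assumption on
       our part. The user names, channel names and scores are
       hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity over the channels both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[c] * b[c] for c in shared)
    norm_a = math.sqrt(sum(a[c] ** 2 for c in shared))
    norm_b = math.sqrt(sum(b[c] ** 2 for c in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings, user, top_n=1):
    """Rank channels the user has not tried by similarity-weighted score."""
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], theirs)
        if sim <= 0:
            continue
        for channel, score in theirs.items():
            if channel not in ratings[user]:
                scores[channel] = scores.get(channel, 0.0) + sim * score
                weights[channel] = weights.get(channel, 0.0) + sim
    ranked = sorted(scores, key=lambda c: scores[c] / weights[c],
                    reverse=True)
    return ranked[:top_n]


ratings = {
    "alice": {"ch21": 5, "ch23": 1},
    "bob": {"ch21": 5, "ch23": 1, "ch30": 4, "ch41": 2},
    "carol": {"ch21": 1, "ch23": 5, "ch30": 1, "ch41": 5},
}
print(recommend(ratings, "alice", top_n=2))
```

       Here "alice" behaves like "bob", so the channel that "bob" rated
       highly is ranked first for her.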

    4.3. Prediction Results 

       When a new spectrum request arrives, the Database abstracts the
       user features mentioned above and predicts the spectrum with the
       trained model. Since the predicted spectrum is the most suitable
       or most frequently used one, the Database can respond directly
       with the best candidate spectrum, or with a list of spectra and
       their probabilities, instead of a randomly selected list of
       available spectra. For security reasons, this also ensures access
       to a stable and trusted Database. Manual operation would be
       permitted and pre-built into the Database. Similarly, other
       recommended results can be pushed in the spectrum response.
       Predicted results can be added to the training datasets to improve
       prediction accuracy and to adjust the false alarm rate
       automatically so that the fit adapts.
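
       The step from per-channel probabilities to a ranked candidate
       list can be sketched as below. The dictionary keys "channel" and
       "probability" are hypothetical and are not PAWS message fields.
       The sketch also assumes that the set of currently available
       channels is known separately.

```python
def rank_candidates(probabilities, available, top_n=3):
    """Keep only available channels and order them best-first."""
    usable = {ch: p for ch, p in probabilities.items() if ch in available}
    ranked = sorted(usable.items(), key=lambda item: item[1], reverse=True)
    return [{"channel": ch, "probability": round(p, 3)}
            for ch, p in ranked[:top_n]]


predicted = {"ch21": 0.2, "ch23": 0.9, "ch30": 0.6}
available = {"ch21", "ch30"}
print(rank_candidates(predicted, available))
```

       Filtering by availability before ranking keeps a highly rated but
       unavailable channel out of the response.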

       An alternative recommendation method works as follows. When a
       requested spectrum period expires, the master device releases the
       spectrum and sends spectrum feedback to the WSDB. The feedback
       carries an evaluation score that describes satisfaction with this
       white space access. If a spectrum is statistically rated high on
       a frequent basis, it is ranked at the top and allocated
       preferentially to other users the next time.
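
       The feedback-based ranking can be sketched as a running mean of
       satisfaction scores per channel. The class name, the method names
       and the score scale are hypothetical, and using the mean as the
       ranking statistic is an assumption.

```python
from collections import defaultdict


class FeedbackRanker:
    """Rank channels by the mean satisfaction score reported after use."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def report(self, channel, score):
        """Record one feedback score sent when a spectrum period expires."""
        self.total[channel] += score
        self.count[channel] += 1

    def ranking(self):
        """Return channels ordered from highest to lowest mean score."""
        return sorted(self.count,
                      key=lambda ch: self.total[ch] / self.count[ch],
                      reverse=True)


ranker = FeedbackRanker()
for channel, score in [("ch21", 2), ("ch30", 5), ("ch21", 3), ("ch30", 4)]:
    ranker.report(channel, score)
print(ranker.ranking())
```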

    5. Working flow 

       This section introduces the system implementation architecture.
       Our Database should be distributed locally to address mobility
       and scalability problems. Since node mobility management involves
       the related registration and termination problems, localization
       can reduce query latency and ease scalability issues. It also
       allows big data to be learned in portions and integrated freely
       for analysis at varying granularity by transforming between lower
       and higher dimensional data spaces.

    5.1. Spectrum prediction scenario

         +-----------+             +-----------+          +----------+ 
         |           |             |           |          |   WSDB   | 
         |    WSD    |             |    WSDB   |          | Analysis | 
         |           |             |           |          |  Server  | 
         +-----------+             +-----------+          +----------+ 
             |                           |     all users history     | 
             |                           |    feature abstraction    | 
             |                           |---------------------------| 
             |                           |                           | 
             |                           |dataset training & modeling| 
             |                           |---------------------------| 
             |                           |                           | 
             |   AVAIL_SPEC_BATCH_REQ    |                           | 
             |-------------------------->|                           | 
             |                           |     feature abstraction   | 
             |                           |   & spectrum prediction   | 
             |                           |<------------------------->| 
             |AVAIL_SPEC_BATCH_RESP with |                           |
             |    predicted spectrum     |                           | 
             |<--------------------------|                           | 
             |                           |                           | 
          Figure 1: WSD obtains a predicted spectrum from the WSDB

       As Figure 1 shows, the Database does not need to check the
       currently available spectrum for every white space device. We can
       even trace user activity and preset the spectrum that a series of
       users is likely to use. This shortens the query delay, reduces
       the resource lookup cost and gives access to an optimized spectrum
       in return.
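
       The idea of presetting a likely spectrum for a series of users
       can be sketched as a per-cluster cache in front of the predictor.
       The class and its methods are hypothetical. How long a preset
       stays valid under regulatory rules is outside the scope of this
       sketch, so an explicit invalidation hook is included.

```python
class PresetCache:
    """Cache one predicted channel list per user cluster."""

    def __init__(self, predictor):
        self.predictor = predictor  # callable: cluster id -> channel list
        self.presets = {}
        self.lookups = 0            # calls that actually reached the predictor

    def spectrum_for(self, cluster_id):
        """Return the preset list, predicting only on the first request."""
        if cluster_id not in self.presets:
            self.lookups += 1
            self.presets[cluster_id] = self.predictor(cluster_id)
        return self.presets[cluster_id]

    def invalidate(self, cluster_id):
        """Drop a preset, for example when availability changes."""
        self.presets.pop(cluster_id, None)


cache = PresetCache(lambda cluster: ["ch%d" % (20 + cluster)])
print(cache.spectrum_for(1), cache.spectrum_for(1), cache.lookups)
```

       Only the first request from a cluster pays the prediction cost;
       later requests are answered from the preset.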

    5.2. WSDB Commendation Procedure 

    6. Security Considerations 

       According to the security assumptions in the use case
       requirements, the Master Device and the Database may suffer six
       types of threats. Because our protocol adds no message exchanges,
       it does not introduce new interception risks. Moreover, a group of
       malicious attackers could be identified easily, since they would
       act with similar behavior.

    7. IANA Considerations 

       This document makes no request of IANA.

    8. Conclusions 

       This memo discusses the functions of a smart Database during white
       space database access and describes some scenarios.

    9. References 

    9.1. Normative References 

       [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 
                 Requirement Levels", BCP 14, RFC 2119, March 1997. 

       [RFC3339] Klyne, G., Ed. and Newman, C., "Date and Time on the 
                 Internet: Timestamps", RFC 3339, July 2002. 

       [RFC4594] Babiarz, J., Ed. and Chan, K., "Configuration Guidelines 
                 for DiffServ Service Classes", RFC 4594, August 2006. 

       [RFC5127] Chan, K., Ed. and Baker, F., "Aggregation of Diffserv
                 Service Classes", RFC 5127, February 2008.

       [I-D.ietf-paws-protocol] Chen, V., Das, S., Zhu, L., Malyar, J., and
                 P. McCann, "Protocol to Access Spectrum Database",
                 draft-ietf-paws-protocol-03 (work in progress), February
                 2013.

       [I-D.das-paws-protocol] Das, S., Malyar, J., and D. Joslyn, "Device
                 to Database Protocol for White Space",
                 draft-das-paws-protocol-02 (work in progress), July 2012.

       [I-D.ietf-paws-problem-stmt-usecases-rqmts] Mancuso, A. and B. Patil, 
                 "Protocol to Access White Space (PAWS) Database: Use Cases 
                 and Requirements", draft-ietf-paws-problem-stmt-usecases-
                 rqmts-12 (work in progress), January 2013. 

       [I-D.wei-paws-framework] Wei, X., Zhu, L., and P. McCann, "PAWS 
                 Framework", draft-wei-paws-framework-00 (work in progress), 
                 July 2012. 

    10. Acknowledgments 

       Thanks to our colleagues for their sincere contributions and
       comments during the drafting of this document.

    Authors' Addresses

       Jianfeng Guan
       State Key Laboratory of Networking and Switching Technology
       Beijing University of Posts and Telecommunications,
       Beijing, 100876, P.R.China

       EMail: jfguan@bupt.edu.cn


       Neng Zhang
       State Key Laboratory of Networking and Switching Technology
       Beijing University of Posts and Telecommunications,
       Beijing, 100876, P.R.China

       EMail: zn@bupt.edu.cn


       Changqiao Xu
       State Key Laboratory of Networking and Switching Technology
       Beijing University of Posts and Telecommunications,
       Beijing, 100876, P.R.China

       EMail: cqxu@bupt.edu.cn


       Hongke Zhang
       State Key Laboratory of Networking and Switching Technology
       Beijing University of Posts and Telecommunications,
       Beijing, 100876, P.R.China

       EMail: hkzhang@bupt.edu.cn