> The code includes detailed comments that explain the key details:
Please note that the data is encoded and compressed.
> We use two datasets for our experiments:
The Stack Overflow dataset from Google's BigQuery. Download here.
Yahoo Answers Network Flows Data, version 1.0 (multi part) (hosted on AWS), which is a private dataset and can be requested here.
The case study:
Random Question used in case study: [SEN] i m starting/VBG to/TO really/RB love/VB extension/NN \n methods/NNS [DOT] i was/VBD wondering/VBG if/IN anyone/NN \n her/PRP$ has/VBZ stumbled/VBN upon/IN one/CD that/WDT \n really/RB blew/NN their/PRP$ mind/NN or/CC just/RB found/VBN \n clever/RB [DOT] an/DT example/NN i wrote/VBD today/NN [DOT] \n i can/MD t wait/VB to/TO see/VB other/JJ examples/NNS enjoy/VBP \n [DOT][ESEN]
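The question above is stored in a tagged form: `[SEN]` and `[ESEN]` appear to mark the start and end of the text, `[DOT]` marks a sentence boundary, and most words carry a part-of-speech tag after a slash. The following is a minimal sketch of how such a string could be split back into sentences of `(word, tag)` pairs. The function name `parse_tagged` is hypothetical, and the sketch assumes that `\n` appears as a literal two-character token and that untagged words (such as `i`) simply have no tag; it is not taken from the repository code.

```python
import re

# Markers, the literal "\n" token, or any other run of non-space,
# non-bracket characters (a plain word or a word/TAG pair).
TOKEN_RE = re.compile(r"\[SEN\]|\[DOT\]|\[ESEN\]|\\n|[^\s\[\]]+")


def parse_tagged(text):
    """Split a tagged question into sentences of (word, tag) pairs.

    Words without a "/TAG" suffix get None as their tag.
    """
    sentences, current = [], []
    for tok in TOKEN_RE.findall(text):
        if tok in ("[SEN]", "\\n"):
            # Start marker and literal line-break tokens carry no content.
            continue
        if tok in ("[DOT]", "[ESEN]"):
            # Sentence boundary: flush the words collected so far.
            if current:
                sentences.append(current)
                current = []
            continue
        word, sep, tag = tok.rpartition("/")
        current.append((word, tag) if sep else (tok, None))
    if current:
        sentences.append(current)
    return sentences


sample = r"[SEN] i m starting/VBG to/TO \n love/VB [DOT] i was/VBD \n [DOT][ESEN]"
for sentence in parse_tagged(sample):
    print(sentence)
```

Tags containing special characters, such as `PRP$`, are kept as-is because the split is made on the last slash only.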
Link: Selected True Positive User With Higher Rank and False Positive Experts With Lower Rank
True Positive Expert gains a higher rank ==> Reason: Active temporal pattern and relevance to the temporal pattern of the question's author.
Please note that the False Positive Expert only provides answers and has not asked any questions in the knowledge domain of the given question. The file Mid-Ranked-False-Positive-Expert-Questions.csv would therefore be empty and is not included.
Relation to our previous TKDE, DMKD, and WWWJ articles
Our initial works include detecting concepts in a single short text using an external knowledge base (KB) [1] and identifying concepts [2] through unsupervised clustering in the tweet space, thus eliminating the bias and deviation caused by a KB. Our other previous articles [3][4] propose temporal-textual embedding models that manually track dynamic perturbations in short-text content using discrete time.
Following our prior efforts, this paper leverages large-scale short-text content to recommend the most suitable users to answer a given query within the expected time constraint. In addition, we empirically observe that Fourier transformers can automatically infer multi-aspect base signals and outperform manual discrete-time models in obtaining time-specific user profiles.
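As a rough illustration of what "inferring base signals" from a temporal pattern can mean, the sketch below applies a plain discrete Fourier transform to a user's hourly activity counts and reports the strongest periodicities. This is not the model used in the paper: the function name `dominant_periods`, the hourly binning, and the synthetic data are all assumptions made for the example, and a real implementation would normally use an FFT library rather than this quadratic-time loop.

```python
import cmath
import math


def dominant_periods(counts, top_k=2):
    """Return the top_k strongest periods (in samples) of an activity series.

    Uses a naive O(n^2) discrete Fourier transform on the mean-centred
    series, so the constant (zero-frequency) component is ignored.
    """
    n = len(counts)
    mean = sum(counts) / n
    centred = [c - mean for c in counts]
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        # Frequency index k corresponds to a period of n / k samples.
        spectrum.append((abs(coeff), n / k))
    spectrum.sort(reverse=True)
    return [period for _, period in spectrum[:top_k]]


# Synthetic example: one week of hourly activity with a strong daily
# cycle and a weaker weekly cycle (hypothetical data).
hours = range(7 * 24)
activity = [
    5 + 3 * math.cos(2 * math.pi * t / 24) + 2 * math.cos(2 * math.pi * t / 168)
    for t in hours
]
print(dominant_periods(activity))
```

On this synthetic series the two strongest components are the 24-hour and 168-hour cycles, which is the kind of daily and weekly base signal a manual discrete-time model would have to fix in advance.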
[1] Hua, W., Wang, Z., Wang, H., Zheng, K., and Zhou, X. (2016). Understand short texts by harvesting and analyzing semantic knowledge. IEEE Trans. on Knowledge and Data Engineering (TKDE), 29(3), 499-512.
[2] Hosseini, S., Yin, H., Zhou, X., Sadiq, S., Kangavari, M. R., and Cheung, N. M. (2019). Leveraging multi-aspect time-related influence in location recommendation. World Wide Web (WWWJ), 22(3), 1001-1028.
[3] Najafipour, S., Hosseini, S., Hua, W., Kangavari, M. R., and Zhou, X. (2020). SoulMate: Short-text author linking through Multi-aspect temporal-textual embedding. IEEE Transactions on Knowledge and Data Engineering (TKDE).
[4] Hosseini, S., Najafipour, S., Cheung, N. M., Yin, H., Kangavari, M. R., and Zhou, X. (2020). TEAGS: time-aware text embedding approach to generate subgraphs. Data Mining and Knowledge Discovery (DMKD), 34, 1136-1174.
Dr. Saeid Hosseini, saeid.hosseini@uq.net.au
Mohsen Saaki, mohhsensaaki@gmail.com
Sana Rahmani, rahmany.sana@gmail.com