CN102760130B - Method and apparatus for processing information - Google Patents

Method and apparatus for processing information

Info

Publication number
CN102760130B
Authority
CN
China
Prior art keywords
information
reply information
judgment result
junk
said reply
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110107521.3A
Other languages
Chinese (zh)
Other versions
CN102760130A (en
Inventor
周文江
李勤学
郑志昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201110107521.3A
Publication of CN102760130A
Application granted
Publication of CN102760130B


Abstract

The invention discloses a method and apparatus for processing information, belonging to the computer field. The method includes: receiving reply information from any user to user-generated content, and judging whether the reply information is junk information; if so, placing the reply information in the junk information category, having all users who see the reply information in the junk information category make a secondary judgment on it, and recording the judgment results; counting, among all users who see the reply information in the junk information category, the number of users who judge the reply information to be normal information, and judging whether that number meets a preset threshold; if so, labeling the reply information as normal information.

Description

Method and apparatus for processing information
Technical field
The present invention relates to the computer field, and in particular to a method and apparatus for processing information.
Background
With the development of the Internet, and in particular the arrival of the Web 2.0 era, network applications have penetrated ever more deeply into daily life. Various UGC (User Generated Content) applications, such as personal spaces including blogs, message boards, microblogs, and photo albums, have become an important platform for people to record and present themselves and to foster emotional interaction with friends and family. The SPAM (junk information) that has followed, however, is increasingly rampant: malicious users may insert advertisements into UGC applications or post malicious replies. Such SPAM content not only consumes a large amount of network resources and degrades the online experience of many users, but may also cause economic loss to users with low security awareness, and may even become a source of social instability.
In current UGC applications, SPAM decision schemes are essentially based on content features, including the attributes of the poster, keywords in the content, and the degree of content repetition. The SPAM decision system first builds a weight table of feature items. When a new piece of UGC information arrives, the system extracts and computes the values of all features in the information and sums the weighted feature items to obtain the probability that the UGC information is SPAM. When the probability exceeds a preconfigured threshold, the UGC information is deemed SPAM and the system automatically removes it from the user data block (or marks it), so that the user who published the UGC does not see the junk information.
After analyzing the prior art, the inventors found that it has at least the following drawback: existing SPAM decision schemes depend only on the UGC content itself, judging whether it is junk information from its content features. Yet for the same UGC content that the system has judged to be SPAM, some users may agree that it is SPAM while others may regard it as normal information. Because existing SPAM decision methods do not take users' opinions into account, false positives and false negatives often occur. False positives in particular hurt users' feelings, while false negatives encourage SPAM producers to take their chances and continue this illegitimate activity, disturbing the normal network environment.
Summary of the invention
In order to determine SPAM more accurately, embodiments of the present invention provide a method and apparatus for processing information. The technical solution is as follows:
In one aspect, a method for processing information is provided, the method including:
receiving reply information from any user to user-generated content published on the Internet, and judging whether the reply information is junk information;
recording the judgment result, having all users who see the reply information make a secondary judgment on the reply information, and recording the secondary judgment results of all the users;
labeling the reply information as normal information or junk information according to the secondary judgment results of all the users;
The recording of the judgment result, having all users who see the reply information make a secondary judgment on the reply information, and recording the secondary judgment results of all the users includes:
if the judgment result is yes, placing the reply information in the junk information category, having all users who see the reply information in the junk information category make a secondary judgment on the reply information, and recording the judgment results of all the users;
Correspondingly, the labeling of the reply information as normal information according to the secondary judgment results of all the users includes:
counting, among all users who see the reply information in the junk information category, the number of users who judge the reply information to be normal information, and judging whether that number meets a preset threshold;
if so, labeling the reply information as normal information;
Before the labeling of the reply information as normal information, the method further includes:
judging whether the reply information contains sensitive information;
if so, reviewing the reply information again, and continuing with the subsequent steps only after the review is passed;
when the publishing user labels as junk information a piece of reply information in the personal space that was labeled as normal information, not displaying the reply information in the personal space;
when the publishing user labels as normal information a piece of reply information in the personal space that was labeled as junk information, and the reply information contains no sensitive information, displaying the reply information in the personal space;
when the publishing user labels as normal information a piece of reply information in the personal space that was labeled as junk information, and the reply information contains sensitive information, reviewing the reply information again and displaying it in the personal space after the review is passed.
The judging of whether the reply information is junk information includes:
extracting each feature item of the reply information;
computing a weighted sum of the feature items to obtain a probability value that the reply information is junk information;
if the obtained probability value is greater than a preset threshold, determining that the reply information is junk information.
After the labeling of the reply information as normal information, the method further includes:
recording the results in which users who saw the reply information judged the junk information to be normal information;
adjusting the weight of each feature item of reply information according to the recorded user judgment results.
The recording of the judgment result, having all users who see the reply information make a secondary judgment on the reply information, and recording the secondary judgment results of all the users includes:
if the reply information is not junk information, placing the reply information in the normal information category, having all users who see the reply information in the normal information category make a secondary judgment on the reply information, and recording the judgment results of all the users;
Correspondingly, the labeling of the reply information as junk information according to the secondary judgment results of all the users includes:
counting, among all users who see the reply information in the normal information category, the number of users who judge the reply information to be junk information, and judging whether that number meets a preset threshold;
if so, labeling the reply information as junk information.
After the labeling of the reply information as junk information, the method further includes:
recording the results in which users who saw the reply information judged the normal information to be junk information;
adjusting the weight of each feature item of reply information according to the recorded user judgment results.
In another aspect, an apparatus for processing information is provided, the apparatus including:
a first judgment module, configured to receive reply information from any user to user-generated content published on the Internet, and to judge whether the reply information is junk information;
a recording module, configured to record the judgment result, have all users who see the reply information make a secondary judgment on the reply information, and record the secondary judgment results of all the users;
a second judgment module, configured to label the reply information as normal information or junk information according to the secondary judgment results of all the users;
The recording module is specifically configured to:
if the judgment result of the first judgment module is yes, place the reply information in the junk information category, have all users who see the reply information in the junk information category make a secondary judgment on the reply information, and record the judgment results of all the users;
Correspondingly, the second judgment module is specifically configured to:
count, among all users who see the reply information in the junk information category, the number of users who judge the reply information to be normal information, and judge whether that number meets a preset threshold; if so, label the reply information as normal information;
The apparatus further includes:
a third judgment module, configured to judge, before the reply information is labeled as normal information, whether the reply information contains sensitive information; and if so, to review the reply information again and continue with the subsequent steps only after the review is passed;
The second judgment module is further configured to: when the publishing user labels as junk information a piece of reply information in the personal space that was labeled as normal information, not display the reply information in the personal space; when the publishing user labels as normal information a piece of reply information in the personal space that was labeled as junk information, and the reply information contains no sensitive information, display the reply information in the personal space; and when the publishing user labels as normal information a piece of reply information in the personal space that was labeled as junk information, and the reply information contains sensitive information, review the reply information again and display it in the personal space after the review is passed.
The first judgment module includes:
an extraction unit, configured to extract each feature item of the reply information;
a computing unit, configured to compute a weighted sum of the feature items to obtain a probability value that the reply information is junk information;
a judging unit, configured to determine that the reply information is junk information if the probability value obtained by the computing unit is greater than a preset threshold.
The apparatus further includes:
a first adjusting module, configured to record, after the reply information is labeled as normal information, the results in which users who saw the reply information judged the junk information to be normal information, and to adjust the weight of each feature item of reply information according to the recorded user judgment results.
The recording module is specifically configured to:
if the judgment result of the first judgment module is no, place the reply information in the normal information category, have all users who see the reply information in the normal information category make a secondary judgment on the reply information, and record the judgment results of all the users;
Correspondingly, the second judgment module is specifically configured to:
count, among all users who see the reply information in the normal information category, the number of users who judge the reply information to be junk information, and judge whether that number meets a preset threshold; if so, label the reply information as junk information.
The apparatus further includes:
a second adjusting module, configured to record, after the reply information is labeled as junk information, the results in which users who saw the reply information judged the normal information to be junk information, and to adjust the weight of each feature item of reply information according to the recorded user judgment results.
The technical solution provided by the embodiments of the present invention uses users' choices to assist in judging SPAM in UGC content. Users can see and take part in the system's SPAM judgments, which improves the accuracy and speed with which the system judges SPAM in UGC content. Users can also submit their own judgment on every reply in their personal space: they have the opportunity to restore non-SPAM content that the system misjudged to the normal content category, and they can classify an individual reply in their own space as SPAM. This improves the precision of SPAM judgment and the user experience.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for processing information provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a method for processing information provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of an apparatus for processing information provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic diagram of another apparatus for processing information provided by Embodiment 3 of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the drawings.
Embodiment 1
Referring to Fig. 1, an embodiment of the present invention provides a method for processing information, including:
Step 101: receive reply information from any user to user-generated content, and judge whether the reply information is junk information;
Step 102: record the judgment result, have all users who see the reply information make a secondary judgment on the reply information, and record the secondary judgment results of all the users;
Step 103: label the reply information as normal information or junk information according to the secondary judgment results of all the users.
Judging whether the reply information is junk information includes:
extracting each feature item of the reply information;
computing a weighted sum of the feature items to obtain a probability value that the reply information is junk information;
if the obtained probability value is greater than a preset threshold, determining that the reply information is junk information.
Labeling the reply information as normal information or junk information according to the secondary judgment results of all the users includes:
if the judgment result is yes, placing the reply information in the junk information category, having all users who see the reply information in the junk information category make a secondary judgment on the reply information, and recording the judgment results of all the users;
Correspondingly, labeling the reply information as normal information according to the secondary judgment results of all the users includes:
counting, among all users who see the reply information in the junk information category, the number of users who judge the reply information to be normal information, and judging whether that number meets a preset threshold; if so, labeling the reply information as normal information.
Optionally, before the reply information is labeled as normal information, the method further includes:
judging whether the reply information contains sensitive information;
if so, reviewing the reply information again, and continuing with the subsequent steps only after the review is passed.
Optionally, after the reply information is labeled as normal information, the method further includes:
recording the results in which users who saw the reply information judged the junk information to be normal information;
adjusting the weight of each feature item of reply information according to the recorded user judgment results.
Further, in this embodiment, labeling the reply information as normal information or junk information according to the secondary judgment results of all the users includes:
if the reply information is not junk information, placing the reply information in the normal information category, having all users who see the reply information in the normal information category make a secondary judgment on the reply information, and recording the judgment results of all the users;
Correspondingly, labeling the reply information as junk information according to the secondary judgment results of all the users includes:
counting, among all users who see the reply information in the normal information category, the number of users who judge the reply information to be junk information, and judging whether that number meets a preset threshold;
if so, labeling the reply information as junk information.
Optionally, after the reply information is labeled as junk information, the method further includes:
recording the results in which users who saw the reply information judged the normal information to be junk information;
adjusting the weight of each feature item of reply information according to the recorded user judgment results.
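The text states only that feature weights are adjusted according to the recorded user judgments; it does not give an update rule. The following is a minimal sketch assuming a simple additive update, in which the weights of the features present in a reply are lowered when users overturn a junk verdict and raised when they overturn a normal verdict. The function name `adjust_weights`, the `learning_rate` parameter, and the feature names are hypothetical.

```python
from typing import Dict


def adjust_weights(
    weights: Dict[str, float],
    features: Dict[str, float],
    users_say_spam: bool,
    learning_rate: float = 0.1,
) -> Dict[str, float]:
    """Return a new weight table nudged toward the users' verdict.

    Assumption: a simple additive update. The source only says that the
    weights are adjusted according to the recorded user judgments.
    """
    direction = 1.0 if users_say_spam else -1.0
    updated = dict(weights)  # leave the caller's table untouched
    for name, value in features.items():
        # Only features that occur in the reply are adjusted.
        updated[name] = updated.get(name, 0.0) + direction * learning_rate * value
    return updated
```

Returning a copy rather than mutating the table keeps the previous weights available, which may be useful if an adjustment later needs to be rolled back.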
The technical solution provided by this embodiment of the present invention uses users' choices to assist in judging SPAM in UGC content. Users can see and take part in the system's SPAM judgments, which improves the accuracy and speed with which the system judges SPAM in UGC content. Users can also submit their own judgment on every reply in their personal space: they have the opportunity to restore non-SPAM content that the system misjudged to the normal content category, and they can classify an individual reply in their own space as SPAM. This improves the precision of SPAM judgment and the user experience.
Embodiment 2
Referring to Fig. 2, an embodiment of the present invention provides a method for processing information, including:
Step 201: the SPAM decision system receives reply information to UGC.
In this embodiment, because the UGC is published on the Internet, Internet users can see the UGC through some channel and reply to it. These users may include friends of the publishing user, but may also include advertisers or other malicious users. To keep the user's personal space tidy and safe, each reply to the UGC must first be checked by the SPAM system to judge whether it is junk information.
Step 202: judge whether the reply information is junk information; if so, perform step 203, otherwise perform step 206.
The SPAM decision system first builds a weight table of feature items. After receiving the reply information to the UGC, it extracts the relevant feature items of the reply information according to the entries of this table and thereby judges whether the reply information is junk information. Specifically:
1) extract and compute the values of all feature items in the reply information;
2) sum the weighted feature items to obtain a probability value that the reply information is junk information;
3) if the obtained probability value is greater than a preset threshold, the reply information is junk information; otherwise the reply information is normal information.
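The three sub-steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the feature names, weight values, and threshold are hypothetical (the background section names poster attributes, keywords, and content repetition as example features but gives no values), and feature values are assumed to lie in [0, 1].

```python
from typing import Dict

# Hypothetical weight table of feature items; values are illustrative only.
FEATURE_WEIGHTS: Dict[str, float] = {
    "poster_attributes": 0.2,
    "spam_keywords": 0.5,
    "repetition": 0.3,
}
SPAM_THRESHOLD = 0.6  # assumed preset threshold


def spam_probability(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sub-step 2: weighted sum of the extracted feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def is_spam(
    features: Dict[str, float],
    weights: Dict[str, float] = FEATURE_WEIGHTS,
    threshold: float = SPAM_THRESHOLD,
) -> bool:
    """Sub-step 3: junk information if the score exceeds the threshold."""
    return spam_probability(features, weights) > threshold
```

Sub-step 1, the feature extraction itself, is left out because the source does not describe how individual feature values are computed.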
In this embodiment, optionally, for content judged to be SPAM, the system also marks whether the content contains sensitive information, so that when users judge again and the SPAM is judged to be non-SPAM, content containing sensitive information can be judged once more.
Step 203: place the reply information in the junk information category, have all users who see the reply information in the junk information category make a secondary judgment on the reply information, and record the judgment results of all the users.
After judging whether the reply information is SPAM or non-SPAM, the system optionally places the reply information in the junk information category or the normal information category, so that when users open the personal space they can see immediately which information is junk and which is not, and can then verify the reply information further. Separate folders may be set up for the junk information category and the normal information category, or the two kinds of information may be classified in some other way; this embodiment imposes no specific limitation.
Step 204: count, among all users who see the reply information in the junk information category, the number of users who judge the reply information to be normal information.
In this embodiment, to respect the views of the majority of users, the reply information is not reclassified on the basis of a single user's judgment. Instead, a threshold is set in advance, and the reply information is reclassified only when the number of users holding the same judgment reaches the preset value. To this end, the system records each user's secondary judgment and counts the number of each distinct judgment.
Step 205: judge whether the number of users who judge the reply information to be normal information meets the preset threshold; if so, label the reply information as normal information; otherwise, delete the reply information.
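Steps 204 and 205 amount to tallying one verdict per user and comparing the tally with a preset count. The sketch below assumes that "meets the threshold" means "greater than or equal to", and that a later verdict from the same user replaces the earlier one; neither detail is stated in the source. The class and function names are hypothetical.

```python
from typing import Dict


class SecondaryJudgments:
    """Records one secondary verdict per user for a single reply."""

    def __init__(self) -> None:
        self._votes: Dict[str, bool] = {}  # user_id -> "this reply is spam"

    def record(self, user_id: str, says_spam: bool) -> None:
        # Assumption: a repeated vote overwrites the user's earlier vote,
        # so each user is counted at most once.
        self._votes[user_id] = says_spam

    def count(self, says_spam: bool) -> int:
        """Number of users holding the given verdict (step 204)."""
        return sum(1 for vote in self._votes.values() if vote == says_spam)


def relabel_as_normal(judgments: SecondaryJudgments, threshold: int) -> bool:
    """Step 205: True when enough users judged a junk-classified reply normal."""
    return judgments.count(says_spam=False) >= threshold
```

Keying the votes by user ID prevents a single user from reaching the threshold alone by voting repeatedly, which matches the stated aim of following the majority rather than one user.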
In this embodiment, after the SPAM decision system has judged the reply information to the UGC, it classifies the reply information and stores all received reply information in the user's personal space. When the user visits the space, they can see all reply information under the different classification results. Further, the user judges again the reply information that the system has already judged, so that the user does not miss any information that is useful to them. For example, after the system judges a piece of reply information to be junk information and places it in the junk information category, the user can open the junk folder, view the reply information, and judge according to their own needs whether it is junk information. If it is, the user removes or marks it; otherwise the user classifies it as normal information.
Because most personal spaces are public, any user's judgment that a reply is non-SPAM would let all users see that reply. For sensitive information this policy is clearly inappropriate, so for sensitive information the user can only submit a reversal request and cannot change the state of the comment immediately. Therefore, optionally, the reply information is reviewed again after the user's secondary judgment, and the subsequent steps continue only after the review is passed.
Step 206: place the reply information in the normal information category, have all users who see the reply information in the normal information category make a secondary judgment on the reply information, and record the judgment results of all the users.
In this embodiment, after the system judges the reply information to be normal information, it places the reply information in the normal information category. Users can see the reply information in the normal information category and judge it again; the system records each user's judgment so that it can count the number of users holding the same judgment.
Step 207: count, among all users who see the reply information in the normal information category, the number of users who judge the reply information to be junk information.
Step 208: judge whether the number of users who judge the reply information to be junk information meets the preset threshold; if so, delete the reply information.
In this embodiment, after the system judges the reply information to be non-junk (normal) information and places it in the normal information category, the user views the reply information that the system classified as normal and judges according to their own needs whether it is junk information. If it is, the user removes or marks it; otherwise it remains classified as normal information.
In Internet UGC applications, all content is submitted by users, and users care a great deal about their own network space: they deliberately keep it tidy and do not want SPAM in it. This embodiment therefore introduces user decisions into SPAM judgment. Users can correct the system's judgment immediately, so the accuracy and precision of the system's SPAM judgment improve directly. Because users differ, different users may judge differently whether the same UGC reply is SPAM. A user can change the judgment of whether a given UGC reply in their personal space is SPAM, but this does not immediately reverse the system's own decision.
In this embodiment, the user can see in the personal space both the content that the system judged to be SPAM and the normal content (content the system considers normal). If the user considers a reply that the system judged normal to be SPAM, the user can label the reply as SPAM directly in the personal space, and the reply disappears from the space immediately. Optionally, the user's choice is fed back to the system and influences its evaluation of other similar replies: the system records the user's judgment of the reply information and adjusts the weight of each feature item of reply information according to the recorded user judgments.
If the user considers a reply that the system judged to be SPAM to be normal content, and the reply contains no sensitive information, the user can label the reply as normal directly in the personal space, and the reply reappears in the space. Optionally, the user's choice is fed back to the system and influences its evaluation of other similar replies. If the reply contains sensitive information, the user's choice does not take effect immediately; a back-end review must first determine whether the information can be classified as normal.
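The owner's three possible outcomes described above (hide at once, show at once, or wait for review) can be sketched as a small state update. The `Reply` fields, the `Label` enum, and the `pending_review` flag are hypothetical names chosen for illustration; the sketch also assumes that only replies labeled normal are displayed in the space.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    NORMAL = "normal"
    SPAM = "spam"


@dataclass
class Reply:
    text: str
    label: Label
    has_sensitive_info: bool = False
    pending_review: bool = False  # hypothetical flag for the back-end review queue

    @property
    def visible(self) -> bool:
        # Assumption: only replies labeled normal are displayed in the space.
        return self.label is Label.NORMAL


def owner_relabel(reply: Reply, new_label: Label) -> Reply:
    """Apply the space owner's verdict to a reply in their own space."""
    if new_label is Label.SPAM:
        reply.label = Label.SPAM  # disappears from the space immediately
    elif reply.has_sensitive_info:
        reply.pending_review = True  # stays hidden until the review is passed
    else:
        reply.label = Label.NORMAL  # reappears in the space immediately
    return reply
```

The feedback of the owner's choice into the system's feature weights is deliberately left out of this sketch, since the source describes it as optional.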
Steps 203-208 are the specific steps by which the SPAM decision system records its own judgment result, has all users who see the reply information make a secondary judgment on it, records the secondary judgment results of all the users, and labels the reply information as normal information or junk information according to those results.
The beneficial effects of the technical solution provided by this embodiment of the present invention are as follows. The method uses users' choices to assist in judging SPAM in UGC content. Users can see and take part in the system's SPAM judgments, which improves the accuracy and speed with which the system judges SPAM in UGC content. Users can also submit their own judgment on every reply in their personal space: they have the opportunity to restore non-SPAM content that the system misjudged to the normal content category, and they can classify an individual reply in their own space as SPAM. This improves the precision of SPAM judgment and the user experience.
Embodiment 3
Referring to Fig. 3, this embodiment of the present invention provides a device for processing information. The device includes a first judging module 301, a recording module 302, and a second judging module 303.
The first judging module 301 is configured to receive reply information from any user on user-generated content, and to judge whether the reply information is junk information.
The recording module 302 is configured to record the judgment result, to enable all users who see the reply information to make a secondary judgment on it, and to record the secondary judgment results of all those users.
The second judging module 303 is configured to mark the reply information as normal information or junk information according to the secondary judgment results of all the users.
Referring to Fig. 4, the first judging module 301 includes:
an extraction unit 301a, configured to extract each feature item of the reply information;
a calculation unit 301b, configured to compute a weighted sum of the feature items to obtain a probability value that the reply information is junk information; and
a judging unit 301c, configured to determine that the reply information is junk information if the probability value obtained by the calculation unit is greater than a preset threshold.
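The cooperation of units 301a-301c might be sketched as follows, as an illustration only. The feature names, the weight values, the threshold of 0.5, and the clamping of the weighted sum to the range [0, 1] are all assumptions made for this sketch; the embodiment does not specify them.

```python
from typing import Dict


def spam_probability(features: Dict[str, float],
                     weights: Dict[str, float]) -> float:
    """Weighted sum of the feature items, clamped to [0, 1].

    Clamping is an assumption so that the result can be read as a
    probability value; features without a weight contribute nothing.
    """
    score = sum(weights.get(name, 0.0) * value
                for name, value in features.items())
    return max(0.0, min(1.0, score))


def is_junk(features: Dict[str, float],
            weights: Dict[str, float],
            threshold: float = 0.5) -> bool:
    """The reply is junk if the probability value exceeds the threshold."""
    return spam_probability(features, weights) > threshold


# Hypothetical feature items extracted from one reply.
example_features = {"has_link": 1.0, "repeated_text": 1.0}
example_weights = {"has_link": 0.4, "repeated_text": 0.3}
print(is_junk(example_features, example_weights))
```

A reply whose probability value equals the threshold exactly is treated as normal here, since the text requires the value to be greater than the threshold.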
The recording module 302 is specifically configured to:
if the judgment result of the first judging module is yes, place the reply information in the junk information category, enable all users who see the reply information in the junk information category to make a secondary judgment on it, and record the judgment results of all those users.
Correspondingly, the second judging module is specifically configured to:
count the number of users who, having seen the reply information in the junk information category, judged it to be normal information; judge whether that number meets a preset threshold; and if so, mark the reply information as normal information.
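The counting step performed by the second judging module could look roughly like the sketch below. It assumes that secondary judgments are stored as a mapping from a user identifier to the label "normal" or "spam", and that "meets the preset threshold" means "greater than or equal to"; both are assumptions of this sketch rather than details stated in the embodiment.

```python
from typing import Dict


def count_normal_votes(votes: Dict[str, str]) -> int:
    """Number of users whose secondary judgment is "normal"."""
    return sum(1 for verdict in votes.values() if verdict == "normal")


def second_judgment(votes: Dict[str, str], threshold: int) -> str:
    """Re-label a reply that the system placed in the junk category.

    The reply is marked normal once enough users have judged it normal;
    otherwise it stays in the junk category.
    """
    if count_normal_votes(votes) >= threshold:
        return "normal"
    return "spam"


# Hypothetical secondary judgments, keyed by user identifier.
votes = {"user_a": "normal", "user_b": "spam", "user_c": "normal"}
print(second_judgment(votes, threshold=2))
```

Keying the votes by user identifier means that a user who changes his or her mind overwrites the earlier judgment instead of being counted twice.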
Referring to Fig. 4, optionally, the device further includes:
a third judging module 304, configured to judge, before the reply information is marked as normal information, whether the reply information contains sensitive information; and if so, to review the reply information again and continue with the subsequent steps only after the review is passed.
Referring to Fig. 4, optionally, the device further includes:
a first adjusting module 305, configured to record, after the reply information is marked as normal information, the results in which users who saw the reply information judged the junk information to be normal information, and to adjust the weight of each feature item of the reply information according to the recorded user judgment results.
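The embodiment does not state how the weights are adjusted. Purely as an illustration, the sketch below uses a simple perceptron-style update, which is a technique substituted here and not the rule of the embodiment: when the users' judgment contradicts the system's, each weight is moved in the direction of the users' judgment in proportion to the feature value. The function name, the `learning_rate` parameter, and the feature names are hypothetical.

```python
from typing import Dict


def adjust_weights(weights: Dict[str, float],
                   features: Dict[str, float],
                   system_said_junk: bool,
                   users_said_junk: bool,
                   learning_rate: float = 0.1) -> Dict[str, float]:
    """Return updated feature weights after user feedback.

    Perceptron-style update (an assumption of this sketch): no change
    when users agree with the system; otherwise raise the weights if
    users say junk, and lower them if users say normal.
    """
    updated = dict(weights)
    if system_said_junk == users_said_junk:
        return updated
    direction = 1.0 if users_said_junk else -1.0
    for name, value in features.items():
        updated[name] = updated.get(name, 0.0) + direction * learning_rate * value
    return updated


# Users overturned a junk judgment, so the active feature loses weight.
new_weights = adjust_weights({"has_link": 0.6}, {"has_link": 1.0},
                             system_said_junk=True, users_said_junk=False)
print(new_weights)
```

The same function covers the opposite case, in which users judge a reply that the system considered normal to be junk.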
Referring to Fig. 4, optionally, the recording module 302 is specifically configured to:
if the judgment result of the first judging module is no, place the reply information in the normal information category, enable all users who see the reply information in the normal information category to make a secondary judgment on it, and record the judgment results of all those users.
Correspondingly, the second judging module 303 is specifically configured to:
count the number of users who, having seen the reply information in the normal information category, judged it to be junk information; judge whether that number meets a preset threshold; and if so, mark the reply information as junk information.
Referring to Fig. 4, optionally, the device further includes:
a second adjusting module 306, configured to record, after the reply information is marked as junk information, the results in which users who saw the reply information judged the normal information to be junk information, and to adjust the weight of each feature item of the reply information according to the recorded user judgment results.
In the technical solution provided by this embodiment of the present invention, user choices are used to assist in judging SPAM in UGC content, so that users can see and take part in the system's SPAM judgment, which improves both the accuracy and the speed with which the system judges SPAM in UGC content. A user can submit his or her own judgment on every reply in the personal space, and thus has the opportunity to restore non-SPAM content that the system misjudged to the normal content category. The user can also classify an individual reply in his or her own space as SPAM. This improves the precision of SPAM judgment and the user experience.
The device provided by this embodiment belongs to the same concept as the method embodiments. For its specific implementation, refer to the method embodiments; the details are not repeated here.
All or part of the above technical solution provided by the embodiments of the present invention may be implemented by hardware under the control of program instructions. The program may be stored in a readable storage medium, which includes any medium capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for processing information, characterized in that the method comprises:
receiving reply information from any user on user-generated content published on the Internet, and judging whether the reply information is junk information;
recording the judgment result, enabling all users who see the reply information to make a secondary judgment on the reply information, and recording the secondary judgment results of all the users;
marking the reply information as normal information or junk information according to the secondary judgment results of all the users;
wherein the recording the judgment result, enabling all users who see the reply information to make a secondary judgment on the reply information, and recording the secondary judgment results of all the users comprises:
if the judgment result is yes, placing the reply information in a junk information category, enabling all users who see the reply information in the junk information category to make a secondary judgment on the reply information, and recording the judgment results of all the users;
correspondingly, the marking the reply information as normal information according to the secondary judgment results of all the users comprises:
counting the number of users who see the reply information in the junk information category and whose judgment result on the reply information is normal information, and judging whether the number of judgment results that the reply information is normal information meets a preset threshold;
if so, marking the reply information as normal information;
before the marking the reply information as normal information, the method further comprises:
judging whether the reply information contains sensitive information;
if so, reviewing the reply information again, and continuing with the subsequent steps after the review is passed;
the method further comprises:
when the posting user marks, as junk information, the reply information in the personal space that has been marked as normal information, not displaying the reply information in the personal space;
when the posting user marks, as normal information, the reply information in the personal space that has been marked as junk information, and the reply information contains no sensitive information, displaying the reply information in the personal space;
when the posting user marks, as normal information, the reply information in the personal space that has been marked as junk information, and the reply information contains sensitive information, reviewing the reply information again, and displaying the reply information in the personal space after the review is passed.
2. The method according to claim 1, characterized in that the judging whether the reply information is junk information comprises:
extracting each feature item of the reply information;
computing a weighted sum of the feature items to obtain a probability value that the reply information is junk information;
if the obtained probability value is greater than a preset threshold, determining that the reply information is junk information.
3. The method according to claim 1, characterized in that, after the marking the reply information as normal information, the method further comprises:
recording the results in which users who see the reply information judge the junk information to be normal information;
adjusting the weight of each feature item of the reply information according to the recorded user judgment results.
4. The method according to claim 1, characterized in that the recording the judgment result, enabling all users who see the reply information to make a secondary judgment on the reply information, and recording the secondary judgment results of all the users comprises:
if the judgment result is no, placing the reply information in a normal information category, enabling all users who see the reply information in the normal information category to make a secondary judgment on the reply information, and recording the judgment results of all the users;
correspondingly, the marking the reply information as junk information according to the secondary judgment results of all the users comprises:
counting the number of users who see the reply information in the normal information category and whose judgment result on the reply information is junk information, and judging whether the number of judgment results that the reply information is junk information meets a preset threshold;
if so, marking the reply information as junk information.
5. The method according to claim 4, characterized in that, after the marking the reply information as junk information, the method further comprises:
recording the results in which users who see the reply information judge the normal information to be junk information;
adjusting the weight of each feature item of the reply information according to the recorded user judgment results.
6. A device for processing information, characterized in that the device comprises:
a first judging module, configured to receive reply information from any user on user-generated content published on the Internet, and to judge whether the reply information is junk information;
a recording module, configured to record the judgment result, to enable all users who see the reply information to make a secondary judgment on the reply information, and to record the secondary judgment results of all the users;
a second judging module, configured to mark the reply information as normal information or junk information according to the secondary judgment results of all the users;
wherein the recording module is specifically configured to:
if the judgment result of the first judging module is yes, place the reply information in a junk information category, enable all users who see the reply information in the junk information category to make a secondary judgment on the reply information, and record the judgment results of all the users;
correspondingly, the second judging module is specifically configured to:
count the number of users who see the reply information in the junk information category and whose judgment result on the reply information is normal information, and judge whether the number of judgment results that the reply information is normal information meets a preset threshold; and if so, mark the reply information as normal information;
the device further comprises:
a third judging module, configured to judge, before the reply information is marked as normal information, whether the reply information contains sensitive information; and if so, to review the reply information again and continue with the subsequent steps after the review is passed;
the second judging module is further configured to: when the posting user marks, as junk information, the reply information in the personal space that has been marked as normal information, not display the reply information in the personal space; when the posting user marks, as normal information, the reply information in the personal space that has been marked as junk information, and the reply information contains no sensitive information, display the reply information in the personal space; and when the posting user marks, as normal information, the reply information in the personal space that has been marked as junk information, and the reply information contains sensitive information, review the reply information again and display the reply information in the personal space after the review is passed.
7. The device according to claim 6, characterized in that the first judging module comprises:
an extraction unit, configured to extract each feature item of the reply information;
a calculation unit, configured to compute a weighted sum of the feature items to obtain a probability value that the reply information is junk information;
a judging unit, configured to determine that the reply information is junk information if the probability value obtained by the calculation unit is greater than a preset threshold.
8. The device according to claim 6, characterized in that the device further comprises:
a first adjusting module, configured to record, after the reply information is marked as normal information, the results in which users who see the reply information judge the junk information to be normal information, and to adjust the weight of each feature item of the reply information according to the recorded user judgment results.
9. The device according to claim 6, characterized in that the recording module is specifically configured to:
if the judgment result of the first judging module is no, place the reply information in a normal information category, enable all users who see the reply information in the normal information category to make a secondary judgment on the reply information, and record the judgment results of all the users;
correspondingly, the second judging module is specifically configured to:
count the number of users who see the reply information in the normal information category and whose judgment result on the reply information is junk information, and judge whether the number of judgment results that the reply information is junk information meets a preset threshold; and if so, mark the reply information as junk information.
10. The device according to claim 9, characterized in that the device further comprises:
a second adjusting module, configured to record, after the reply information is marked as junk information, the results in which users who see the reply information judge the normal information to be junk information, and to adjust the weight of each feature item of the reply information according to the recorded user judgment results.
CN201110107521.3A 2011-04-27 2011-04-27 The method and apparatus of process information Active CN102760130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110107521.3A CN102760130B (en) 2011-04-27 2011-04-27 The method and apparatus of process information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110107521.3A CN102760130B (en) 2011-04-27 2011-04-27 The method and apparatus of process information

Publications (2)

Publication Number Publication Date
CN102760130A CN102760130A (en) 2012-10-31
CN102760130B true CN102760130B (en) 2016-11-16

Family

ID=47054588

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110107521.3A Active CN102760130B (en) 2011-04-27 2011-04-27 The method and apparatus of process information

Country Status (1)

Country Link
CN (1) CN102760130B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104252479B (en) * 2013-06-27 2018-05-18 华为技术有限公司 Processing method, the device and system of information
CN104753758B (en) * 2013-12-30 2019-04-26 深圳市腾讯计算机系统有限公司 A kind of information attribute recognition methods and device
CN103970832A (en) * 2014-04-01 2014-08-06 百度在线网络技术(北京)有限公司 Method and device for recognizing spam
CN111289276A (en) * 2020-01-20 2020-06-16 北京韬盛科技发展有限公司 Intelligent lifting safety detection method and device for climbing frame

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6421709B1 (en) * 1997-12-22 2002-07-16 Accepted Marketing, Inc. E-mail filter and method thereof
CN1809821A (en) * 2003-03-03 2006-07-26 微软公司 Feedback loop for spam prevention

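Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A collaborative anti-spam strategy based on user behavior analysis; Chen Jianfa et al.; Computer Knowledge and Technology (《电脑知识与技术》); 20070731; p. 37 *
A spam identification method based on user feedback and incremental learning; Wang Xin et al.; Journal of Tsinghua University (《清华大学学报》); 20060131; pp. 70-72 *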

Also Published As

Publication number Publication date
CN102760130A (en) 2012-10-31

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant