The assert statement exists in almost every programming language. When you write

    assert condition

you're telling the program to test that condition, and trigger an error if the condition is false. In Python, it's roughly equivalent to this:

    if not condition:
        raise AssertionError()

Try it in the Python shell:

    >>> assert True
    >>> assert False
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AssertionError

Assertions can include an optional message, and you can disable them when you're done debugging.
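The optional message mentioned above can be sketched as follows. The function names `mean` and `describe_failure` are made up for this illustration; the only language features relied on are the `assert condition, message` form and the fact that running Python with the `-O` option skips assert statements.

```python
def mean(values):
    # The expression after the comma becomes the AssertionError message.
    assert len(values) > 0, "mean() needs at least one value"
    return sum(values) / len(values)


def describe_failure(values):
    """Return the assertion message for bad input, or None if mean() succeeds."""
    try:
        mean(values)
    except AssertionError as exc:
        return str(exc)
    return None


print(mean([1, 2, 3]))
print(describe_failure([]))
```

Because `python -O` removes assertions entirely, they suit debugging checks rather than validation that must always run.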
http://www.03964.com/read/7c7f4d52e250dac51ed82edd.html

A Beginner’s Guide to Materials Studio and DFT Calculations with Castep

P. Hasnip (pjh503@york.ac.uk)
September 18, 2007

Materials Studio collects all of its files into “Projects”. We’ll start by creating a new project.

Now we’ve got a blank project, and we want to define a simulation cell to perform a Castep calculation on. First we add a “3D Atomistic document”.

We’re going to start by simulating an eight atom silicon FCC cell, so rename the file accordingly. First we’ll create the unit cell.

The default is space group P1, i.e. no symmetry. Silicon has the diamond structure (space group FD3M). Once you tell Materials Studio this symmetry it will automatically apply it to the atoms, thus generating atoms at the symmetry points.

Now to add the lattice constant: click on the “Lattice” tab near the top of the “Build Crystal” window. Since FD3M is cubic (FCC) Materials Studio knows only a has to be set, and the angles and other lattice constants are greyed-out. Enter “5.4”, and then click on “Build”.

Now we’ll add a single silicon atom...

Add a silicon atom at the origin, by changing the “element” from its default and clicking “Add”. By default the co-ordinates are in fractionals, but you can change this on the “Options” tab.

Since we’d already told Materials Studio what the symmetry of the crystal was, our single silicon atom is replicated at each symmetry site and we now have a shiny new eight-atom silicon unit cell. You can rotate the view by holding down the left or right mouse button and dragging, or move it by holding down the middle button. Use the mouse wheel, or both the left and right buttons simultaneously, to zoom in and out.

By default the atoms are shown as little crosses with lines for bonds, and silicon atoms are coloured brownish orange. You can always change this if you don’t like it. The “bonds” are just guesses made by Materials Studio based on the element’s typical bond-lengths.
We’re now ready to run Castep to find the groundstate charge density. Click on the Castep icon, which is a set of three wavy lines (to represent plane-waves), and select “Calculation”.

Materials Studio offers a high-level interface to Castep, with cut-off energy, k-point sampling, convergence tolerances etc. all set by the single setting “Quality”. We’ll look at how to specify these things later, but for now we’ll just do a very quick, rough calculation of the groundstate energy and density of our cell. Make sure the task is “Energy”, and select “Coarse” for quality, and “LDA” for the XC functional.

If you want to run your calculations on Lagavulin (or anywhere else you have Castep available) then you’ll want to click “Files” in the Castep window. Simply select “Save Files” to save the cell and param files. By default these are written to a folder in “My Documents” called “Materials Studio Projects” but be warned: cell files are hidden files, and you won’t be able to see them unless you tell Windows you want to view “Hidden and System Files” for that folder.

If you want to run Castep on the PC you’re using, you just need to click “Run” on the Castep window. You should see a window appear in which Materials Studio tells you that your system isn’t actually the primitive unit cell, and offers to convert it to the primitive cell for you. For now choose “No”.

Castep runs via a “Gateway”, which might be on your local computer or on a remote machine. This Gateway handles Materials Studio’s requests to run calculations and copies the files to and from the “Castep Server”.

Since the Gateway is actually a modified web server it is sensible to enforce some security measures. If your Gateway is password-protected (recommended), you’ll need to enter your Gateway username and password (which are not necessarily the same as your Windows ones).

When the Castep job is running you will see its job ID and other details appear in the “job explorer” window.
You can check its status from here, although our crude silicon calculation is so quick you probably won’t have time now.

Castep reports back when it is finished, and Materials Studio copies the results of the calculation back. The .castep file is opened automatically so you can see what happened in the calculation.

The main text output file from Castep is displayed in Materials Studio. It starts with a welcome banner, then a summary of the parameters and cell that were used for the calculation.

After that, there is a summary of the electronic energy minimisation which shows the iterations Castep performed trying to find the groundstate density that was consistent with the Kohn-Sham potential. This is the so-called “self-consistent field” or “SCF” condition, and each line is tagged with “<-- SCF” so you can find them easily.

    ------------------------------------------------------------------------ <-- SCF
    SCF loop      Energy           Fermi           Energy gain       Timer   <-- SCF
                                   energy          per atom          (sec)   <-- SCF
    ------------------------------------------------------------------------ <-- SCF
    Initial   2.11973065E+002  4.85767974E+001                         0.61  <-- SCF
    Warning: There are no empty bands for at least one kpoint and spin; this may
             slow the convergence and/or lead to an inaccurate groundstate.
             If this warning persists, you should consider increasing nextra_bands
             and/or reducing smearing_width in the param file.
             Recommend using nextra_bands of 7 to 15.
          1  -7.22277610E+002  1.02240172E+001   1.16781334E+002       0.88  <-- SCF
          2  -8.53739673E+002  6.90687627E+000   1.64327579E+001       1.12  <-- SCF
          3  -8.62681938E+002  6.65069587E+000   1.11778315E+000       1.39  <-- SCF
          4  -8.62169156E+002  6.69758744E+000  -6.40977798E-002       1.72  <-- SCF
          5  -8.61880601E+002  6.78641872E+000  -3.60693332E-002       2.06  <-- SCF
          6  -8.61884687E+002  6.79549194E+000   5.10791707E-004       2.44  <-- SCF
          7  -8.61884645E+002  6.79874201E+000  -5.25062118E-006       2.75  <-- SCF
          8  -8.61884639E+002  6.79822409E+000  -8.40318139E-007       2.98  <-- SCF
    ------------------------------------------------------------------------ <-- SCF

    Final energy, E             =  -861.8846385210     eV
    Final free energy (E-TS)    =  -861.8846385210     eV
    (energies not corrected for finite basis set)

    NB est. 0K energy (E-0.5TS)      =  -861.8846385210     eV

We’ll look at this output in more detail later. For now just note that the energy converges fairly rapidly to about -861.88 eV, but that along the way the energy is sometimes higher than this and sometimes lower.

Let’s have a look at the calculated groundstate charge density.

The Castep Analysis window lets you look at various properties you might have calculated during the Castep job. Select “Electron density”. Notice there’s a “Save” button which lets you write the density out to a text file so you can analyse it with another program. We don’t need this now, so just click on “Import”.

WARNING: amongst the properties listed here are “Band structure” and “Density of states”. If you select one of these from an energy calculation, Materials Studio will plot the band structure/DOS, but it takes the eigenvalues and k-points from the SCF calculation, not a proper band structure or DOS calculation.

By default an isosurface of the charge density is overlaid on your simulation cell.

To change the isosurface Materials Studio is plotting, you need to change the “Display style”.
Either use the right mouse button when the cursor is over the simulation cell, or use the drop-down menus. Notice that this is also the place you need to come to if you want to change the atom colouring or representation (e.g. from crosses and lines to ball-and-stick).

Try changing the value of the isosurface you’re plotting, to see where the charge density is greatest and least.

Hopefully you’ve now got the hang of the basic interface. Go back to your simulation system and open up the Castep window again. This time select the “Electronic” tab.

This tab has a little more detail, and actually tells you what cut-off energy and k-point grid Castep will use for the given settings. Nevertheless we usually want finer control than this, so click on “More”.

Now at last we have four tabs that let us set some of the convergence parameters directly.

Basis: Allows you to set a cut-off energy, as well as control the finite basis set correction.
SCF: Sets the convergence tolerance for the groundstate electronic energy minimisation, as well as details of the algorithm used.
k-points: Controls the Brillouin zone sampling directly. You can either specify a grid, or a desired separation between k-points.
Potentials: Allows you to change the pseudopotentials used for the elements in your system.

In fact if you double-click on your param file in the project window you can edit it directly, but we’ll restrict ourselves to using the GUI for now.

Before we continue, here’s a quick recap of the basic approximations we use when performing practical DFT calculations:

Exchange-correlation (XC) Functional - we don’t know the exact density functional, so we have to approximate it. There are two common approximations:
- LDA - the Local Density Approximation assumes the XC at any point is the same as that of a homogeneous electron gas with the same density.
- PBE - this is a “Generalised Gradient Approximation” (GGA) and includes some of the effects of the gradient of the density.
You might think PBE is always better than LDA, but that’s not true; both are approximations. You should try each one before deciding which is appropriate to your research project.

Basis set - the wavefunction is represented by an expansion in a plane-wave basis. In theory the basis set required is infinite, but since the energy converges rapidly with basis set size we can safely truncate the expansion. The size of the basis set is controlled by the cut-off energy.

Brillouin zone sampling - calculating the energy terms requires us to integrate quantities over the whole of the first Brillouin zone. In practice we approximate these integrals by sums over a discrete set of k-points.

Exercise 1. Using the Basis and k-points tabs, investigate how the calculated energy of the simulation cell converges with increased cut-off energy, and increased k-point sampling density. Why do they show these trends?

Exercise 2. Create a unit cell for bulk aluminium. Aluminium is also FCC, with spacegroup FM-3M and a lattice constant of about 4.05 Å. Investigate convergence of the calculated aluminium energy with respect to cut-off energy and k-point sampling. Compare the total electronic energy with the total electronic free energy for both silicon and aluminium. Why do they differ for one and not the other?

During your calculations you might see a warning like this in the Castep output:

    Warning: There are no empty bands for at least one kpoint and spin; this may
             slow the convergence and/or lead to an inaccurate groundstate.
             If this warning persists, you should consider increasing nextra_bands
             and/or reducing smearing_width in the param file.
             Recommend using nextra_bands of 7 to 15.

Recall that the electronic energy minimisation algorithms need to include the entire set of occupied states. If the highest state you’ve included in the calculation is occupied, Castep has no way of knowing whether the next state should also have been occupied, and so recommends you include more bands.
Only when the highest state is unoccupied can Castep be sure that all of the occupied bands have been included. You can change the number of “empty” bands included in the Castep calculation from the SCF tab of the Castep Electronic Options window of Materials Studio, or just by editing the param file directly.

Exercise 3. Repeat the energy convergence test with respect to k-point sampling for aluminium, but using a smearing of 0.5 eV (see the SCF tab; the default is 0.1 eV). Feel free to use either Materials Studio, or direct editing of the param and cell files. You will probably need to increase the number of empty bands to 8 or so. Compare the results with the previous aluminium calculations. Why the difference? Choose a particular k-point sampling density and look at the final total energy, free energy, and estimated zero temperature energy for the 0.5 eV smearing and compare them to the results with the original smearing.

Exercise 4. Go back to your silicon calculation, and look at the SCF tab on the CASTEP Electronic Options window. We’re using the “Density Mixing” algorithm, and if you click on “More” you’ll see we’re using a Pulay mixing scheme with a charge mixing amplitude of 0.5. Investigate what happens as you vary this initial amplitude from close to 0 to close to 1. The Pulay algorithm takes over after the first few SCF cycles, and overrides the mixing charge amplitude. This is not true of the Kerker scheme. Use the “More” button and change the mixing scheme to Kerker, and investigate the effects of the mixing charge amplitude again.

Exercise 5. Have a play with the Castep interface and Castep. Why don’t you see whether you can get Castep to fail to converge? Remember what causes density mixing to be unstable: metals, degeneracies (band-crossings), multiple spin states, long cells, small smearing widths etc. The only restriction is computational time, so if you make a large cell try not to have too many atoms in it or Castep won’t finish in time!
If you manage to make Castep fail to converge, try to fix it by varying the DM parameters. If that doesn’t work, how does EDFT do? Remember you can always save your cell and param files and copy them to Lagavulin if your PC isn’t fast enough.

You can also try Castep on your favourite system. Things you might find useful:

- Materials Studio ships with lots of sample structures; just click “File” then “Import” and have a look, or create your own.
- To create a supercell from a unit cell, click on the “Build” menu, then select “Symmetry” and then “Supercell”.
- To modify atoms just left-click (or drag-select) to select them, and then you can use the “Modify” menu to change their element.
- Materials Studio has a useful surface builder so you can cleave crystals along bizarre planes without too much effort.

If you’re stuck for things to do:

- Try making a supercell of two aluminium FCC cells, and swapping one of the aluminium atoms for erbium. Run that, and see what happens. Can you improve it?
- Use the task “properties” in the Castep window to calculate the DOS and band structures of silicon and aluminium. Now create a simple molecule surrounded by vacuum, and calculate its band structure and DOS. Do you get what you expect?
- Calculate the binding energy of a simple molecule. Run Castep for the molecule, and then again for a single atom of each of the elements in turn. Subtract the energies, and see what you get. How does the result change if you change the cut-off energy for (a) one calculation; (b) all the calculations?
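For the convergence exercises it can help to pull the SCF summary out of a saved .castep file rather than reading it by eye. The following is a minimal sketch, assuming the SCF rows look like the silicon output shown earlier in this guide (a loop label, energy, Fermi energy, an optional energy gain per atom, a timer, and a trailing SCF tag). The names `parse_scf` and `is_converged`, and the tolerance values, are invented for this illustration and are not part of Castep or Materials Studio.

```python
import re

# Matches the trailing tag, with or without the leading "<".
SCF_TAG = re.compile(r"<?--\s*SCF\s*$")

SAMPLE = """\
SCF loop      Energy           Fermi           Energy gain       Timer   <-- SCF
Initial   2.11973065E+002  4.85767974E+001                         0.61  <-- SCF
      1  -7.22277610E+002  1.02240172E+001   1.16781334E+002       0.88  <-- SCF
      7  -8.61884645E+002  6.79874201E+000  -5.25062118E-006       2.75  <-- SCF
      8  -8.61884639E+002  6.79822409E+000  -8.40318139E-007       2.98  <-- SCF
"""


def parse_scf(text):
    """Return one dict per SCF iteration found in the given output text."""
    rows = []
    for line in text.splitlines():
        if not SCF_TAG.search(line):
            continue  # untagged lines, e.g. the empty-bands warning
        tokens = SCF_TAG.sub("", line).split()
        if not tokens or not (tokens[0] == "Initial" or tokens[0].isdigit()):
            continue  # rule lines and column headers
        values = [float(tok) for tok in tokens[1:]]
        rows.append({
            "loop": 0 if tokens[0] == "Initial" else int(tokens[0]),
            "energy": values[0],
            "fermi": values[1],
            # The "Initial" row has no energy-gain column.
            "gain": values[2] if len(values) == 4 else None,
            "time": values[-1],
        })
    return rows


def is_converged(rows, tol, window=2):
    """True if the last `window` energy gains per atom are all below tol."""
    gains = [row["gain"] for row in rows[-window:]]
    return len(gains) == window and all(
        g is not None and abs(g) < tol for g in gains
    )


rows = parse_scf(SAMPLE)
print(rows[-1]["energy"], is_converged(rows, tol=1e-5))
```

Running the same parser over outputs from several cut-off energies or k-point grids gives the final energies needed for Exercises 1 to 3.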
Google search: lammps mail list, comb
http://lammps.sandia.gov/threads/msg26852.html
http://lammps.sandia.gov/threads/msg26832.html
http://lammps.sandia.gov/threads/msg28484.html
http://scienceblog.com/
- Scientists document fragile land-sea ecological chain
- Technology convergence may widen the digital divide
- Sound Waves as Effective as Brain Surgery at Treating Essential Tremor
- How flowers do it
- Breast Cancer Effectively Treated with Chemical Found in Celery
- Impact of MRSA nasal colonization on surgical site infections after gastrointestinal surgery
- Drug found for parasite that is major cause of death worldwide
- Experimental bariatric surgery controls blood sugar in rats with diabetes
- Good news for nanomedicine: Quantum dots appear safe in pioneering study on primates
GBrowse and GFF

The purpose of this document is to explore how the tables in the MySQL database used by GBrowse relate to the GFF from which they are populated. The conclusions summarize how I currently think it all works, and are most likely to be of interest to others.

Methods
Results
    Table Structure
    Fate of typical GFF fields
    Population of fattribute and fattribute_to_feature tables
    Alignments
Conclusions

Methods

I will be using a system set up as described in my previous document. In addition to the work described on that page, I have loaded WormBase release 130. It is that dataset that I will be exploring in the following examples.

Results

Table Structure

The table structure of GBrowse is as follows:

Fate of typical GFF fields

First, I will explore where exactly the information for a particular line of GFF ends up. Here is an example line from the ws130 GFF file:

    I    Genefinder    CDS    252119    253587    .    +    .    CDS "Y48G1BM.gc6"

or, split into traditional GFF fields by tabs,

    example line from ws130 GFF
    reference sequence    I
    source                Genefinder
    type                  CDS
    start position        252119
    end position          253587
    score                 .
    strand                +
    phase                 .
    group                 CDS "Y48G1BM.gc6"

I will now go through the seven tables in GBrowse depicted above to determine the fate of this information.

fdata

    Table fdata (1 row)
    fref             I
    fstart           252119
    fstop            253587
    fbin             10000.000025
    ftypeid          6
    fscore           null
    fstrand          +
    fphase           null
    gid              121
    ftarget_start    null
    ftarget_stop     null

ftype

    Table ftype (1 row)
    ftypeid    fmethod    fsource
    6          CDS        Genefinder

fgroup

    Table fgroup (1 row)
    gid    gclass    gname
    121    CDS       Y48G1BM.gc6

fdna

Table fdna contained sequence as referenced by the fref field of fdata.
fmeta

There were no relevant rows in fmeta.

fattribute_to_feature

There were no relevant rows in fattribute_to_feature.

fattribute

There were no relevant rows in fattribute.

Comparison with the GBrowse Tutorial reveals that the ftype table is used to determine which track to display this GFF span in, and the fgroup table is used when the user searches for a feature. This all makes intuitive sense. The fields that are null in fdata may pose a problem. Two of the fields that are null, fscore and fphase, correspond to the similarly named fields in the GFF file. There are at least two unresolved issues:

- How can fattribute get populated?
- How can ftarget_start and ftarget_stop get populated?

Population of fattribute and fattribute_to_feature tables

Consider the fate of the following lines of GFF,

    I    Coding_transcript    intron    11690    14950    .    +    .    Transcript "Y74C9A.2.4" ; Confirmed_EST yk1139h01.3
    I    Coding_transcript    intron    11690    14950    .    +    .    Transcript "Y74C9A.2.3" ; Confirmed_EST yk1139h01.3
    I    Coding_transcript    intron    11690    14950    .    +    .    Transcript "Y74C9A.2.2" ; Confirmed_EST yk1139h01.3
    I    curated              intron    11690    14950    .    +    .    CDS "Y74C9A.2" ; Confirmed_EST yk1139h01.3
    I    Coding_transcript    intron    11690    14950    .    +    .    Transcript "Y74C9A.2.1" ; Confirmed_EST yk1139h01.3
    I    Genefinder           intron    11690    14950    .    +    .    CDS "Y74C9A.gc2" ; Confirmed_EST yk1139h01.3

or, taking the first of them field by field,

    fattribute example line from ws130 GFF
    reference sequence    I
    source                Coding_transcript
    type                  intron
    start position        11690
    end position          14950
    score                 .
    strand                +
    phase                 .
    group                 Transcript "Y74C9A.2.4" ; Confirmed_EST yk1139h01.3

These lines yield the following table contents:

fdata

    Table fdata (6 rows)
    fid      fref fstart fstop fbin         ftypeid fscore fstrand fphase gid ftarget_start ftarget_stop
    9406655  I    11690  14950 10000.000001 20      null   +       null   8   null          null
    9406656  I    11690  14950 10000.000001 20      null   +       null   6   null          null
    9406657  I    11690  14950 10000.000001 20      null   +       null   9   null          null
    9406658  I    11690  14950 10000.000001 21      null   +       null   10  null          null
    9406659  I    11690  14950 10000.000001 20      null   +       null   7   null          null
    9406660  I    11690  14950 10000.000001 22      null   +       null   11  null          null

ftype

    Table ftype (3 rows)
    ftypeid    fmethod    fsource
    20         intron     Coding_transcript
    21         intron     curated
    22         intron     Genefinder

fgroup

    Table fgroup (6 rows)
    gid    gclass        gname
    6      Transcript    Y74C9A.2.3
    7      Transcript    Y74C9A.2.1
    8      Transcript    Y74C9A.2.4
    9      Transcript    Y74C9A.2.2
    10     CDS           Y74C9A.2
    11     CDS           Y74C9A.gc2

fdna

Table fdna contained sequence as referenced by the fref field of fdata.

fmeta

There were no relevant rows in fmeta.

fattribute_to_feature

    Table fattribute_to_feature (6 rows)
    fid        fattribute_id    fattribute_value
    9406655    2                yk1139h01.3
    9406656    2                yk1139h01.3
    9406657    2                yk1139h01.3
    9406658    2                yk1139h01.3
    9406659    2                yk1139h01.3
    9406660    2                yk1139h01.3

fattribute

    Table fattribute (1 row)
    fattribute_id    fattribute_name
    2                Confirmed_EST

It should be mentioned that in searching for yk1139h01.3 in fattribute_to_feature, I found the following:

    fid      fattribute_id    fattribute_value
    12236    2                yk1139h01.3
    12237    2                yk1139h01.3
    12238    2                yk1139h01.3
    12239    2                yk1139h01.3
    12240    2                yk1139h01.3
    12241    2                yk1139h01.3

However, there are no entries in fdata that correspond to those fids, and thus they are in principle useless. I am not entirely sure where these rows are coming from, but I note that when grepping the ws130 GFF files for yk1139h01.3, all the entries appeared to be duplicated. It is possible that there is a ton of redundancy in my ws130 database because of the change in names from CHROMOSOME_I to I that occurred recently.
These naming issues appear to be resolved by bp_process_wormbase.pl, but the duplicate lines are not collapsed. Perhaps it would be worth running a second-pass script after bp_process_wormbase.pl that removes redundant lines; this is a non-trivial prospect because of the enormous size of the complete GFF for WormBase.

Alignments

The following lines of GFF likely give rise to alignment entries in the GBrowse tables.

    I    BLAT_EST_BEST    EST_match    11539    11561    99.7    -    .    Target "Sequence:yk1139h01.3" 658 636
    I    BLAT_EST_BEST    EST_match    11618    11632    99.7    -    .    Target "Sequence:yk1139h01.3" 635 621
    I    BLAT_EST_BEST    EST_match    11633    11689    99.7    -    .    Target "Sequence:yk1139h01.3" 619 563
    I    BLAT_EST_BEST    EST_match    14951    15160    99.7    -    .    Target "Sequence:yk1139h01.3" 562 353
    I    BLAT_EST_BEST    EST_match    16473    16781    99.7    -    .    Target "Sequence:yk1139h01.3" 352 44
    I    BLAT_EST_BEST    EST_match    16783    16800    99.7    -    .    Target "Sequence:yk1139h01.3" 43 26
    I    BLAT_EST_BEST    EST_match    16802    16817    99.7    -    .    Target "Sequence:yk1139h01.3" 25 10
    I    BLAT_EST_BEST    EST_match    16820    16827    99.7    -    .    Target "Sequence:yk1139h01.3" 8 1

Searching fdata for just the start and stop of the first one, I get

fdata

    fid      fref fstart fstop fbin        ftypeid fscore fstrand fphase gid   ftarget_start ftarget_stop
    9627963  I    11539  11561 1000.000011 55      99.7   -              12476 636           658

fgroup

    gid      gclass      gname
    12476    Sequence    yk1139h01.3

ftype

    ftypeid    fmethod      fsource
    55         EST_match    BLAT_EST_BEST

fattribute_to_feature

There were no rows in fattribute_to_feature with fid 9627963.

This doesn't make a whole lot of sense... I don't understand where those other entries in fattribute_to_feature are coming from or what role they serve.

Conclusions

Fate of GFF lines

The typical line of GFF results in entries in the fdata, ftype and fgroup tables. The fdata table holds most of the information from the GFF file, except for the contents of the source, type and group fields.
The source and type fields form a unique pair in the ftype table, and are referenced by the ftypeid in the fdata table. The first semicolon-separated pair of terms in the group field is placed as a unique pair in the fgroup table, and referenced by the gid in the fdata table.

Special case: Alignments

If the line of GFF represents an alignment, the group field will have a special structure similar to

    Target "Sequence:yk1139h01.3" 658 636

The Target group class is recognized by bp_load_gff.pl as signifying an alignment, and the next token is split on a colon to generate the real group class and name. The two tokens after that are taken as the start and stop of the alignment on the target sequence. I am not sure if the class of the target must always be Sequence, but it would make sense.

Special case: Additional attributes

If there are one or more semicolons in the group field, such as

    Transcript "Y74C9A.2.4" ; Confirmed_EST yk1139h01.3

it is split; the first pair of terms is used as the group class and name, and the later pairs of terms form attribute names and values. Perhaps for performance reasons, instead of storing both the name and value in the fattribute_to_feature table, the attribute name is stored in the fattribute table and referenced by the fattribute_id.

Enduring Mysteries

The only thing I haven't been able to figure out about the GBrowse tables is where the seemingly useless entries in the fattribute_to_feature table come from, and what they could conceivably be used for. My best explanation is that they are an artifact caused by the repetition of the GFF in the bp_process_wormbase script output.

This web page was written by Alok Saldanha (alok at caltech dot edu).
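The splitting rules summarized in the conclusions can be sketched in a few lines of Python. This is only an illustration of the behaviour inferred above, not the code of bp_load_gff.pl: the function names `parse_group` and `parse_gff_line` and the dictionary keys are invented here, and ordering the target coordinates as (smaller, larger) is an assumption based on the single alignment row observed in fdata.

```python
import shlex

GFF_FIELDS = ("ref", "source", "type", "start", "stop",
              "score", "strand", "phase", "group")


def parse_group(group_field):
    """Split a GFF group field into class, name, target span and attributes."""
    # Semicolons separate the group pair from any additional attributes.
    parts = [shlex.split(part) for part in group_field.split(";")]
    parts = [part for part in parts if part]
    first, extras = parts[0], parts[1:]
    gclass, gname = first[0], first[1]
    target = None
    if gclass == "Target":
        # 'Target "Sequence:name" a b' marks an alignment: the quoted token
        # carries the real class and name, followed by the target span.
        gclass, gname = gname.split(":", 1)
        a, b = int(first[2]), int(first[3])
        target = (min(a, b), max(a, b))  # assumed ordering, see lead-in
    attributes = [(part[0], " ".join(part[1:])) for part in extras]
    return {"gclass": gclass, "gname": gname,
            "target": target, "attributes": attributes}


def parse_gff_line(line):
    """Parse one tab-separated GFF line into a flat dictionary."""
    record = dict(zip(GFF_FIELDS, line.rstrip("\n").split("\t", 8)))
    record["start"] = int(record["start"])
    record["stop"] = int(record["stop"])
    # A "." in the GFF corresponds to the null values seen in fdata.
    record["score"] = None if record["score"] == "." else float(record["score"])
    record["phase"] = None if record["phase"] == "." else record["phase"]
    record.update(parse_group(record.pop("group")))
    return record


example = "\t".join(["I", "BLAT_EST_BEST", "EST_match", "11539", "11561",
                     "99.7", "-", ".",
                     'Target "Sequence:yk1139h01.3" 658 636'])
print(parse_gff_line(example))
```

In this sketch the (source, type) pair would map to an ftype row, (gclass, gname) to an fgroup row, and each attribute pair to rows in fattribute and fattribute_to_feature.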
A range of policy options are available for driving green growth. This document outlines these options and summarises many of the issues that need to be taken into account when embarking on a green growth strategy.

Diagnose key constraints to green growth

As discussed in Towards Green Growth, there are a range of constraints which can prevent the emergence of greener growth. These will vary from country to country and depending on the particular environmental issues at stake.

Figure 1 develops a diagnostic framework for identifying key constraints to greening growth. It characterises constraints to green growth as factors which limit returns to “green” investment and innovation, i.e. those activities which can foster economic growth and development while ensuring that natural assets continue to provide the resources and environmental services on which our well-being relies.

These constraints are divided into two categories:

The first is low overall economic returns, encapsulating factors which create inertia in economic systems (i.e. fundamental barriers to change and innovation) and capacity constraints, or “low social returns”.

The second is low appropriability of returns. This is where market and government failures prevent people from capturing the full value of improved environmental outcomes and efficiency of resource use. Examples include fossil fuel subsidies (government failure) or a lack of incentives for constructing energy efficient buildings (split incentives) or reducing air pollution (negative externalities).

Low economic returns which are a function of inertia constrain the expansion of new or innovative production techniques, technologies and patterns of consumption. These constraints to green innovation are a mixture of market failure and market imperfection. Low returns to R&D are a market failure. Network effects (e.g. barriers to entry that arise from increasing returns to scale in networks) and the bias in the market towards existing technologies are examples of market imperfection. The exception to this is that government failure can arise from attempts to deal with these market failures (e.g. regulatory barriers to competition and government monopolies in network industries).

“Low social returns” implies the absence of enabling conditions for increasing returns to low environmental impact activities. These constraints reduce the choices of consumers and producers to pursue “green” activities. For example, inadequate electricity or water sanitation infrastructure may lead to water pollution or the use of high emission fuels or inefficient production of electricity. They can also include insufficient human capital such that people are not aware of alternative sources of energy or there is insufficient technical know-how to deploy them. In addition, at low levels of development, a mixture of poor infrastructure with low human capital and institutional quality can mean heavy reliance on natural resource extraction and little incentive for improved natural resource use like sustainable forest management. These constraints reflect a mixture of government failure, market failures and market imperfections.

Original text: http://www.oecd.org/dataoecd/32/48/48012326.pdf
R functions for “Regression-type estimation of the parameters of symmetric α-stable laws”

1 Introduction

Type: Regression-type estimation of the parameters of symmetric α-stable laws
Version: 1.0
Date: 2011-11-23
Author: Shibin Zhang
Maintainer: Shibin Zhang sbzhang@shmtu.edu.cn
Description: This document provides R functions for regression-type estimation of the parameters of symmetric α-stable laws.
Usage: To use the software, you will need to download the file regressiontypeestsyS.R into a suitable directory on your computer. This contains the functions listed below and various supporting functions. You should not need to look at the R code in this file unless you want to see the details of what’s going on.

2 The functions

regressiontypestable(x,rp=1,error=10)

The function regressiontypeestsyS is used to estimate the parameters of the law. It employs the method in Koutrouvelis (1980) and Akgiray and Lamoureux (1989). In the function regressiontypeestsyS, x is the sample and rp is the maximum number of recursions. error also controls the number of recursions, but it is the maximal difference between two consecutive estimates. See Koutrouvelis (1980) and Akgiray and Lamoureux (1989) for more details.

References

V. Akgiray, C.G. Lamoureux, “Estimation of stable-law parameters: a comparative study,” J. Bus. Econ. Statist., vol. 7, no. 1, pp. 85-93, 1989.
I. Koutrouvelis, “Regression-type estimation of the parameters of stable laws,” J. Amer. Statist. Assoc., vol. 75, pp. 918-928, 1980.

regressiontypeestsyS.R
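The idea behind regression-type estimation can be illustrated with a short Python sketch. It is not a translation of regressiontypeestsyS.R and omits the recursion controlled by rp and error. It assumes a centred symmetric stable law written as φ(t) = exp(-|ct|^α), for which log(-log|φ(t)|²) = log(2c^α) + α·log|t|, so a straight-line fit against log|t| on the empirical characteristic function gives α as the slope. The function names, the symbol c for the scale, and the grid of t values are choices made for this illustration only.

```python
import math
import random


def ecf_abs2(sample, t):
    """Squared modulus of the empirical characteristic function at t."""
    n = len(sample)
    re = sum(math.cos(t * x) for x in sample) / n
    im = sum(math.sin(t * x) for x in sample) / n
    return re * re + im * im


def regress_alpha_scale(sample, ts):
    """Least-squares fit of log(-log|phi(t)|^2) against log|t|.

    Returns (alpha, c): the slope, and the scale recovered from the
    intercept log(2 * c**alpha).
    """
    ws = [math.log(abs(t)) for t in ts]
    ys = [math.log(-math.log(ecf_abs2(sample, t))) for t in ts]
    k = len(ts)
    w_mean = sum(ws) / k
    y_mean = sum(ys) / k
    sxx = sum((w - w_mean) ** 2 for w in ws)
    sxy = sum((w - w_mean) * (y - y_mean) for w, y in zip(ws, ys))
    alpha = sxy / sxx
    intercept = y_mean - alpha * w_mean
    c = (math.exp(intercept) / 2.0) ** (1.0 / alpha)
    return alpha, c


rng = random.Random(0)
# Two stable laws that are easy to simulate: Cauchy (alpha = 1, c = 1)
# and standard normal (alpha = 2, c = 1/sqrt(2)).
cauchy = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(5000)]
normal = [rng.gauss(0.0, 1.0) for _ in range(5000)]
grid = [0.1 * k for k in range(1, 11)]

print(regress_alpha_scale(cauchy, grid))
print(regress_alpha_scale(normal, grid))
```

The two test samples are used because their true parameters are known, which makes the fit easy to sanity-check before trying real data.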
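1) In the \Shells\Standard LaTeX directory under the SWP installation directory, edit the file Blank - Standard LaTeX Article.shl and add the following lines before \begin{document}:

\RequirePackage{CJK}
\AtBeginDocument{\begin{CJK*}{GBK}{song}\CJKtilde}
\AtEndDocument{\end{CJK*}}

Then save it as Blank - Standard CJK-LaTeX Article.shl.

2) In SWP, under Expert Setting in the typeset menu, set DVI Format Setting to MikTeX LaTeX (point it at the latex.exe of your CTeX installation; if CTeX is installed on c:, the default is C:/CTeX/texmf/miktex/bin/latex.exe), point DVI Preview Setting to MikTeX Yap, and point DVI Print Setting to dvips.

In SWP 5.5 there is no need to modify latex2.dat. Only one simple but crucial step is required: in step 2) above, change the character drop-down list from normal to Simplified Chinese.

One last point: when saving the tex file, be sure to use Save As and save it as Portable LaTeX, with the character set set to Simplified Chinese.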
My first successful LaTeX document. So excited!

\documentclass{article}
\usepackage{amsmath}
\begin{document}
\title{my paper}
\maketitle
\tableofcontents
\section{WinEdt}
Game Theoretic Analysis of Voting in Committees, Cambridge University Press
\end{document}
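From Wikipedia, the free encyclopedia

TF-IDF (term frequency–inverse document frequency) is a weighting technique commonly used in information retrieval and text mining. It is a statistical method for evaluating how important a word is to one document within a document collection or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to how frequently it appears across the corpus. Various forms of TF-IDF weighting are often used by search engines as a measure or rating of the relevance between a document and a user query. Besides TF-IDF, Internet search engines also use ranking methods based on link analysis to determine the order in which documents appear in search results.

Principle

In a given document, the term frequency (TF) is the number of times a given term appears in that document. This number is usually normalized to prevent a bias towards long documents. (The same term is likely to have a higher count in a long document than in a short one, regardless of whether the term is important.) For a term t_i in a particular document, its importance can be expressed as:

\[ \mathrm{tf}_{i,j} = \frac{n_{i,j}}{\sum_k n_{k,j}} \]

where n_{i,j} is the number of occurrences of the term in document d_j, and the denominator is the sum of the occurrences of all terms in document d_j.

The inverse document frequency (IDF) is a measure of the general importance of a term. The IDF of a particular term is obtained by dividing the total number of documents by the number of documents containing the term, and then taking the logarithm of the quotient:

\[ \mathrm{idf}_i = \log \frac{|D|}{|\{ j : t_i \in d_j \}|} \]

where

|D|: the total number of documents in the corpus
|{ j : t_i ∈ d_j }|: the number of documents containing the term t_i (that is, the number of documents for which n_{i,j} ≠ 0). If the term does not appear in the corpus, the denominator would be zero, so 1 + |{ j : t_i ∈ d_j }| is generally used instead.

Then

\[ \mathrm{tfidf}_{i,j} = \mathrm{tf}_{i,j} \times \mathrm{idf}_i \]

A high term frequency within a particular document, together with a low document frequency of that term across the whole collection, produces a high TF-IDF weight. TF-IDF therefore tends to filter out common terms and keep important ones.

Example

Many different mathematical formulas can be used to compute TF-IDF. This example uses the formulas above. The term frequency (TF) is the number of times a term appears divided by the total number of terms in the document. If a document contains 100 terms in total and the term “cow” appears 3 times, then the term frequency of “cow” in that document is 3/100 = 0.03. One way to compute the document frequency (DF) is to count how many documents contain the term “cow” and divide by the total number of documents in the collection. So if “cow” appears in 1,000 documents out of a total of 10,000,000, its inverse document frequency is log10(10,000,000 / 1,000) = 4. The final TF-IDF score is 0.03 * 4 = 0.12.

Application in the vector space model

TF-IDF weighting is often used together with cosine similarity in the vector space model to judge the similarity between two documents.

External links

Term Weighting Approaches in Automatic Text Retrieval
Robust Hyperlinking: An application of tf–idf for stable document addressability.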
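The worked example can be checked with a few lines of Python. This is a minimal sketch of the formulas described in this entry, using a base-10 logarithm because that is what makes the example come out to 4; the function names and the `smooth` flag (which switches to the 1 + count denominator) are invented for this illustration.

```python
import math


def term_frequency(term_count, total_terms):
    """Occurrences of the term divided by all term occurrences in the document."""
    return term_count / total_terms


def inverse_document_frequency(total_docs, docs_with_term, smooth=False):
    """log10 of total documents over documents containing the term.

    With smooth=True the denominator is 1 + docs_with_term, which avoids
    division by zero for a term that appears in no document.
    """
    denominator = docs_with_term + 1 if smooth else docs_with_term
    return math.log10(total_docs / denominator)


def tf_idf(term_count, total_terms, total_docs, docs_with_term, smooth=False):
    tf = term_frequency(term_count, total_terms)
    idf = inverse_document_frequency(total_docs, docs_with_term, smooth)
    return tf * idf


# "cow": 3 occurrences in a 100-term document, found in 1,000 of 10,000,000 documents.
score = tf_idf(3, 100, 10_000_000, 1_000)
print(round(score, 2))  # 0.12
```

Changing the base of the logarithm only rescales every score by the same constant, so it does not change the ranking of terms or documents.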