Searching Sources

    My main family file has 77,000+ individuals in it currently. I have other family files with only a couple of thousand individuals. The performance difference is noticeable with all features activated, but expected and tolerable.

    What I don't understand though is why the searching of Sources is so slow by comparison. I have 3,360 sources currently. If I bring up the Sources pane, and enter a keyword, it takes around 7 seconds for the result to appear. The hardware I use is as per my profile, so no problem in that regard. Is it feasible to rejig the coding to make searching sources quicker?


    #2
    It would be helpful to see some examples of your searches.

    For example, what are you searching for (what's in your search box), how are you searching ("starts with," or "contains"), are you using the sidebar or list window, are you limiting the search to a particular template via the Show menu, if you're in the List window are you limiting the search to a particular field, etc.
    Frank Leister
    Leister Productions Inc.


      #3
      Originally posted by Steven:
      My main family file has 77,000+ individuals in it currently. I have other family files with only a couple of thousand individuals. The performance difference is noticeable with all features activated, but expected and tolerable.

      What I don't understand though is why the searching of Sources is so slow by comparison. I have 3,360 sources currently. If I bring up the Sources pane, and enter a keyword, it takes around 7 seconds for the result to appear. The hardware I use is as per my profile, so no problem in that regard. Is it feasible to rejig the coding to make searching sources quicker?
      Unlike the other searchable lists, the source file is not indexed. Currently, I have 140 000 individuals and some 30 000 sources. Searching the sources can take quite a while: a simple string search takes about 30 seconds. An indexed source file would help a lot.
      Reiner
      SauerRL@me.com • info@reunion-de.de
      Web: http://www.schevenhuette.com
      Web: http://www.reunion-de.de


        #4
        Originally posted by Frank:
        <snip>
        how are you searching ("starts with," or "contains")
        <snip>
        Ahh, thanks Frank, I'd not realised there was now a 'contains' option when searching sources. Obviously it's a change introduced since I last noticed that there was no way to search for text strings anywhere within the source content (I exchanged emails on this subject with Mark Harrison back in January 2018). So that's that problem sorted. I'd become accustomed to adding multiple variations of how I might search for something. For instance, if I were to create a source for the Reunion website, I'd create multiple URL fields so that I could find the source regardless of what string I might search for, recording:

        https://www.leisterpro.com
        http://www.leisterpro.com
        www.leisterpro.com
        leisterpro.com
        leisterpro

        Originally posted by Frank:
        <snip>
        are you using the sidebar or list window, are you limiting the search to a particular template via the Show menu, if you're in the List window are you limiting the search to a particular field, etc.
        <snip>
        I generally use the sidebar, and usually without limiting the search to a particular template. However I've just tested a search whilst limiting to a particular Source Template, and it did indeed speed the search up considerably. Nevertheless, in my case I'd be disinclined to filter by Source Template on a search-by-search basis due to two main reasons:

        1. I could never be sure which Source Template I used when I set up a source originally, so would prefer to search all sources in a single step, even though it takes longer.
        2. Old habits die hard.

        If Reiner's suggestion is feasible though, and speeds things up, happy days!


          #5
          This slow source search problem often bites me in unexpected ways and introduces little speed bumps in my workflow. Almost every time I create a new source, I do so by duplicating an existing source and editing its contents. I filter my source list to show an appropriate source, click on it, and duplicate. Before I can get on with my work, I have to wait 3-4 seconds while Reunion repeats the search from scratch (even though, by definition, a duplicate source is going to be in the found set if the original was). Similarly, once I'm done editing a source, when I close the source window, I have to again wait 3-4 seconds for the search to be repeated (even though only 1 of my ~7700 sources has changed).

          In 2020, there's no reason for this kind of search to take so long, even if it is done from scratch every time. On a modern Mac (I'm running on a 2015-vintage iMac), it should be instantaneous. Mail.app can do a full-text search of my 4.5GB mail archive almost instantaneously, Music.app can find a track in my 35,000-track music library as fast as I can type its name, and BBEdit can highlight every occurrence of a complex grep search in a multi-megabyte text file as I type it, but it takes Reunion 3-4 seconds to search ~1.5MB of source data.
          Brad Mohr
          https://bradandkathy.com/genealogy/


            #6
            Because of what Brad describes, I usually work with the source list showing in the sidebar, but not filtered at all. When I want a new source the same as a previous one, I just scan the bottom of the list by eye to find one that will fit the purpose I'm looking for and duplicate it right there.

            I too have what should be sufficient horsepower - a 2012 Mac Pro with 96GB RAM, 2 x 6 core 3.06GHz CPU, and running from an SSD in a PCI slot, running Mac OS X Mojave 10.14 (up until a week or so ago it was High Sierra 10.13).

            Roger
            Roger Moffat
            http://lisaandroger.com/genealogy/
            http://genealogy.clanmoffat.org/


              #7
              I encounter the slowness problem big time. I often work for stretches where I duplicate sources too, but these do not usually come from the bottom of the unfiltered list. To keep things manageable I therefore keep an appropriate search in the box, but this is subject to the slowdowns. I can easily understand that, in particular with a "contains" search, it is not practical to keep an index to speed up repeated searches. This may be different for "starts with", as there would be far fewer possibilities.
              However, I feel that something that can always be done is to cache the list of search results and re-use it as long as the search term has not changed. You might say adding a source should invalidate the cache, and that is one way to handle it. On the other hand, if a source is added it is easy to check whether this source should be in the cached list, and add it there as necessary. This should be a lot faster than adding it, invalidating the cache, and searching everything again. Ditto for a deleted or changed source.
              Implementing such a cache would eliminate a lot of the problems experienced.
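A rough sketch of that caching idea in Python, for illustration only. This is not Reunion's code; the `Source` and `CachedSourceSearch` names, the single `text` field, and the `full_scans` counter are all assumptions made up for the example. The found set is rebuilt only when the search term changes, while an added, edited, or deleted source is checked on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    number: int
    text: str  # hypothetical: all searchable fields flattened into one string


@dataclass
class CachedSourceSearch:
    """Keeps the found set for one search term and patches it incrementally."""
    sources: dict = field(default_factory=dict)  # source number -> Source
    term: str = ""
    found: set = field(default_factory=set)      # numbers of matching sources
    full_scans: int = 0                          # how many expensive rescans ran

    def _matches(self, source):
        # Case-insensitive "contains" test against the cached term.
        return self.term.lower() in source.text.lower()

    def search(self, term):
        # Rescan every source only on the first search or when the term changes.
        if self.full_scans == 0 or term != self.term:
            self.term = term
            self.found = {n for n, s in self.sources.items() if self._matches(s)}
            self.full_scans += 1
        return sorted(self.found)

    def add_or_update(self, source):
        # Check just this one source against the cached term.
        self.sources[source.number] = source
        if self._matches(source):
            self.found.add(source.number)
        else:
            self.found.discard(source.number)

    def delete(self, number):
        self.sources.pop(number, None)
        self.found.discard(number)
```

With this shape, duplicating or editing one source costs a single string comparison rather than a pass over every source, whatever the total number of sources.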
              Similarly, as a suggestion, if "starts with" is in use an in-memory "trie" data structure could be maintained, leading to a very high search speed (this cannot work for contains).
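To make the trie suggestion concrete, here is a minimal Python sketch. It is an illustration under assumptions, not anything taken from Reunion: the `PrefixTrie` class and its methods are invented for the example. It stores, at every node, the ids of all sources whose title passes through that node, which trades memory for a lookup that costs only as many steps as the prefix has characters.

```python
class PrefixTrie:
    """In-memory trie answering case-insensitive "starts with" queries."""

    def __init__(self):
        self.children = {}  # next character -> child node
        self.ids = set()    # ids of all titles that pass through this node

    def insert(self, title, source_id):
        node = self
        node.ids.add(source_id)  # the root matches the empty prefix
        for ch in title.lower():
            node = node.children.setdefault(ch, PrefixTrie())
            node.ids.add(source_id)

    def starts_with(self, prefix):
        node = self
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return set()  # no title begins with this prefix
        return set(node.ids)


trie = PrefixTrie()
trie.insert("Obituary of John Smith", 1)
trie.insert("Obituary of Jane Doe", 2)
trie.insert("1900 Census", 3)
print(trie.starts_with("obituary of j"))
```

As noted above, this only helps "starts with": a "contains" match can begin at any character position, so walking from the root does not find it.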


                #8
                I simply don't encounter this slowness—but then I almost never use the search, because I can easily find the source I want by scrolling down. Duplication is then quite speedy. I guess this depends on always keeping the source list in good order, so that it's visually obvious where to scroll to.


                  #9
                  Originally posted by Michael Talibard:
                  I guess this depends on always keeping the source list in good order, so that it's visually obvious where to scroll to.
                  I have over 7,700 sources, about 1,600 of which begin "Obituary of [Firstname Lastname]." I need to duplicate one from a certain newspaper. I can't imagine a way in which I could make it visually obvious where to scroll to in order to find one of the four that were published in that particular newspaper. (What I'd really like is to be able to type "obituary fitchburg" in the search field and get all the sources that contain both of those words anywhere in any field...)
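The kind of search wished for here could look something like the following Python sketch. It says nothing about how Reunion stores or searches sources; the function names and the layout of `sources` (source number mapped to a list of field strings) are assumptions for illustration. A source matches when every word of the query occurs somewhere in any of its fields, ignoring case.

```python
def matches_all_words(query, fields):
    # Join the fields with newlines; a query word contains no whitespace,
    # so it can never match across a field boundary.
    haystack = "\n".join(fields).lower()
    return all(word in haystack for word in query.lower().split())


def search_sources(query, sources):
    # `sources` maps a source number to the list of its field values.
    return [number for number, fields in sources.items()
            if matches_all_words(query, fields)]


sources = {
    1: ["Obituary of John Smith", "Fitchburg Sentinel"],
    2: ["Obituary of Jane Doe", "Boston Globe"],
    3: ["Fitchburg city directory"],
}
print(search_sources("obituary fitchburg", sources))
```

In this made-up data only source 1 contains both words, even though they sit in different fields.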

                  Sorting the source list by "Data" rather than source number could help, but it doesn't, because Reunion is equally glacial in sorting the source list (or re-sorting after editing a source). Again, a comparison to iTunes is apt: I can view my entire 35,000 item iTunes library in a single table view and re-sort instantaneously, even when the sort order uses data from multiple fields (which it often does: sorting "by album" actually sorts by album first, then by track number). In Reunion, even just reversing the order of an existing sort is slow.

                  Curiously, sorting appears to take about the same amount of time even when the source list is significantly abridged by searching first, which makes me think it's actually sorting ALL of the sources, not just the found set. I tried searching for an unusual word ("Scio," the name of a township in Michigan, if you're interested). I have exactly two sources that contain "Scio." Sorting the resulting list (again, a list containing two items!) takes about 3 seconds. (Three seconds to sort two items; we're truly living in the world of tomorrow!) You might think it spent that time doing something useful, right? Surely it's got a fully-sorted list of all the sources tucked away. But, no. Removing the word from the search field causes an even longer delay (6 seconds, of which the final 2 are with the dreaded "spinning rainbow cursor"). Strange.
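For comparison, here is a small Python sketch of what sorting only the found set with a multi-field key involves. It is an illustration, not a description of Reunion's internals; `sort_found_set` and the `template`/`data`/`number` field names are hypothetical. The cost depends on the size of the found set, so a two-item list stays trivial however many sources exist in total.

```python
from operator import itemgetter


def sort_found_set(sources, found_numbers, keys, descending=False):
    # Pull out only the matching sources, then sort that short list.
    found = [sources[n] for n in found_numbers]
    # itemgetter with several keys yields a tuple, so the sort compares
    # the first field and falls back to the next on ties
    # (like "by album, then by track number").
    return sorted(found, key=itemgetter(*keys), reverse=descending)


sources = {
    1: {"number": 1, "template": "Obituary", "data": "Fitchburg Sentinel"},
    2: {"number": 2, "template": "Census", "data": "1900 Michigan"},
    3: {"number": 3, "template": "Obituary", "data": "Boston Globe"},
}
ordered = sort_found_set(sources, [1, 3], ["template", "data"])
print([s["number"] for s in ordered])
```

Reversing an already sorted list needs no second sort at all: `list(reversed(ordered))` is a single linear pass.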
                  Last edited by bmohr; 06 November 2020, 07:37 AM. Reason: Switched a word to clarify
                  Brad Mohr
                  https://bradandkathy.com/genealogy/


                    #10
                    Brad, you say "…1,600 of which begin "Obituary of [Firstname Lastname]…" Sorry, I know this is unhelpful for you (not my intention) but I must say I would never have entered them like that. For anyone following this discussion who is not so far down the road—still deciding how to handle things—I urge you to choose for each source template a sequence of data that makes them searchable visually. For obituaries, I start with the newspaper, then the date; for a census page, its date first, then country; for a parish record, it begins country, area, town. These were chosen to make a clear visual pattern, from which fresh details stand out (samples attached).

                    So I hardly ever need the search facility. This principle applies not just within Reunion, but all across my computer. If I leave things in the right place, I can find them more easily by looking there than by typing something into Spotlight Search. It's like looking for socks in the sock drawer.

                    Attachments: Sample 1.jpg, Sample 2.jpg
                    Last edited by Michael Talibard; 07 November 2020, 03:22 AM.


                      #11
                      Whoa, did something get tweaked with the latest Reunion maintenance update (build 201110)? Because my source searching just improved about a zillion per cent.

                      If so, thank you Leister Productions for listening to customers again!! Just one reason I stick with Reunion.


                        #12
                        Originally posted by Steven:
                        Whoa, did something get tweaked with latest Reunion maintenance update (build 201110)? Because my source searching just improved about a zillion per cent.
                        Yes, we addressed this issue in yesterday's update. Happy to hear that you noticed significant improvement!

                        Frank Leister
                        Leister Productions Inc.


                          #13
                          Fantastic! My source-searching improved quite a lot. Thanks for listening.
                          Reiner Sauer
                          SauerRL@me.com • info@reunion-de.de
                          Web: http://www.schevenhuette.com
                          Web: http://www.reunion-de.de


                            #14
                            Maybe a bit late to the game - but thank you for improving the Source search functionality. I won't say search has become instantaneous, but searching inside 82,000 sources has now become feasible. Searching for unused sources still takes some time, but I don't use that so often.
                            --
                            Eric Van Beest
                            Spring, TX

                            Researching: Van Beest, Feijen, Van Herk
