
Avoiding Crawling Loops for "Includes"

Question asked by Micheal Stephenson on Jun 30, 2018
Latest reply on Jul 3, 2018 by Micheal Stephenson

We often have to scan sites whose structure we don't know. When reviewing the crawl, it frequently looks like there are loops, or perhaps references to resource directories that get called repeatedly through includes. The crawl path might look something like this:

 

site root/calendar/maincalendar/includes/calendar/maincalendar/includes/... and so on; it will stop several iterations deep.

 

Is there a good way to safely prevent these loops using something like redundant link settings or blacklists, without knowing the exact purpose of these files or directories?
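
For reference, here is a rough sketch of the repeated-segment pattern I'm describing (the regex and helper function are purely illustrative, not anything built into the scanner):

    import re
    from urllib.parse import urlparse

    # Matches one or more path segments that immediately repeat
    # themselves at least once, e.g. /a/b/a/b or /x/x.
    LOOP_PATTERN = re.compile(r"(/[^/]+(?:/[^/]+)*?)(?:\1)+")

    def looks_like_crawl_loop(url: str) -> bool:
        """True if the URL path contains an immediately repeated
        directory sequence, a common symptom of an include loop."""
        return LOOP_PATTERN.search(urlparse(url).path) is not None

    # Example:
    # looks_like_crawl_loop(
    #     "http://example.com/calendar/maincalendar/includes/maincalendar/includes/x")
    # -> True

Something along those lines could in principle feed a blacklist entry, but I'd rather use a supported setting than guess at one.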
